Abstract

In this paper, we propose a new approach for deriving probabilistic inequalities. Our main
idea is to exploit the information of underlying distributions by virtue of the monotone
likelihood ratio property and the Berry-Esseen inequality. Unprecedentedly sharp bounds for the
tail probabilities of some common distributions are established. The applications of the
probabilistic inequalities in parameter estimation are discussed.

1 Introduction

Probabilistic inequalities are important ingredients of fundamental probability theory. A classical
approach for deriving probabilistic inequalities is based on the moment or moment generating
functions of relevant random variables. In view of the fact that the moment generating function
is actually a moment function in a general sense, we call this approach the Method of Moments.
Many well-known inequalities, such as the Markov inequality, the Chebyshev inequality, the Chernoff
bounds [5], and the Hoeffding [7] inequalities, are developed in this framework. In order to use the
method of moments to derive probabilistic inequalities, a critical step is to obtain a closed-form
expression for the moment or moment generating function. However, for some common distributions,
the moment or moment generating function may be either unavailable or too complicated for analytical
treatment. Familiar examples are Student's t-distribution, Snedecor's F-distribution, the
hypergeometric distribution, and the hypergeometric waiting-time distribution, for which the method
of moments is not useful for deriving sharp bounds for tail probabilities. In addition to this
limitation, another drawback of the method of moments is that the information of the underlying
distribution may not be fully exploited. This is especially true when the relevant distribution is
analytical and known.



*The author had been previously working with Louisiana State University at Baton Rouge, LA 70803, USA, 
and is now with Department of Electrical Engineering, Southern University and A&M College, Baton Rouge, LA 
70813, USA; Email: chenxinjia@gmail.com 
In this paper, we take a new path to derive probabilistic inequalities. In order to overcome the
limitations of the method of moments, we exploit the information of the underlying distribution by
virtue of the statistical concept of the Monotone Likelihood Ratio Property (MLRP). We discovered
that the MLRP is extremely powerful for deriving sharp bounds for the tail probabilities of a large
class of distributions. In particular, in combination with the Berry-Esseen inequality, the MLRP can
be employed to improve upon the Chernoff-Hoeffding bounds for the tail probabilities of the
exponential family by a factor of about two. For common distributions such as Student's t-distribution,
Snedecor's F-distribution, the hypergeometric distribution, and the hypergeometric waiting-time
distribution, we also obtained unprecedentedly sharp bounds for the tail probabilities. We demonstrate
that the MLRP can be used to illuminate probabilistic phenomena with very elementary knowledge.

The remainder of the paper is organized as follows. In Section 2, we present our most general 
results, especially the Likelihood Ratio Bounds (LRB). Section 3 gives bounds on the distribution 
of likelihood ratio. In Section 4, we develop a unified theory for bounding the tail probabilities of 
the exponential family. In Section 5, we apply our general theory to obtain tight bounds for the 
tail probabilities of common distributions. In Section 6, we explore the general applications of 
the probabilistic inequalities for parameter estimation. Section 7 is the conclusion.

Throughout this paper, we shall use the following notations. The set of real numbers is denoted
by $\mathbb{R}$. The set of integers is denoted by $\mathbb{Z}$. We use the notation
$\Pr\{\cdot \mid \theta\}$ to indicate that the associated random samples $X_1, X_2, \dots$ are
parameterized by $\theta$. The parameter $\theta$ in $\Pr\{\cdot \mid \theta\}$ may be dropped
whenever this can be done without introducing confusion. The expectation of a random variable
is denoted by $\mathbb{E}[\cdot]$. The notation $I_Z$ denotes the support of $Z$. The other
notations will be made clear as we proceed.

2 Likelihood Ratio Bounds 

The statistical concept of monotone likelihood ratio plays a central role in our development of 
new probabilistic inequalities. Before presenting our new results, we shall describe the MLRP 
as follows. Let $X_1, X_2, \dots, X_n$ be a sequence of random variables defined in probability space
$(\Omega, \mathscr{F}, \Pr)$ such that the joint distribution of $X_1, \dots, X_n$ is determined by
parameter $\theta$ in $\Theta$. Let $f_n(x_1, \dots, x_n; \theta)$ be the joint probability density
function for the continuous case or the probability mass function for the discrete case, where
$(x_1, \dots, x_n)$ denotes a realization of $(X_1, \dots, X_n)$. The family of joint probability
density or mass functions is said to possess the MLRP if there exist a nonnegative multivariate function
$\Lambda(z, \vartheta_0, \vartheta_1)$ of $z \in \mathscr{Z}$, $\vartheta_0 \in \Theta$,
$\vartheta_1 \in \Theta$ and a multivariate function $\varphi = \varphi(x_1, \dots, x_n)$ of
$x_1, \dots, x_n$ such that the following requirements are satisfied.

(I) $\varphi = \varphi(x_1, \dots, x_n)$ takes values in $\mathscr{Z}$ for arbitrary realization,
$(x_1, \dots, x_n)$, of $(X_1, \dots, X_n)$.

(II) For arbitrary parametric values $\theta_0, \theta_1 \in \Theta$, the function
$\Lambda(z, \theta_0, \theta_1)$ is non-decreasing with respect to $z \in \mathscr{Z}$ provided that
$\theta_0 < \theta_1$.

(III) For arbitrary parametric values $\theta_0, \theta_1 \in \Theta$, the likelihood ratio
$\frac{f_n(x_1, \dots, x_n; \theta_1)}{f_n(x_1, \dots, x_n; \theta_0)}$ can be expressed as
$\Lambda(\varphi, \theta_0, \theta_1)$.
Now we are ready to state our general results as Theorem 1 in the following.

Theorem 1 Let $\varphi = \varphi(X_1, \dots, X_n)$. Let $\vartheta(z)$ be a function of
$z \in \mathscr{Z}$ taking values in $\Theta$. Suppose the monotone likelihood ratio property holds.
Define $\mathscr{M}(z, \theta) = \Lambda(z, \vartheta(z), \theta)$ for $z \in \mathscr{Z}$
and $\theta \in \Theta$. Then,

\[ \Pr\{\varphi \ge z \mid \theta\} \le \mathscr{M}(z, \theta) \times \Pr\{\varphi \ge z \mid \vartheta(z)\} \le \mathscr{M}(z, \theta) \qquad (1) \]

for $z \in \mathscr{Z}$ such that $\vartheta(z)$ is no less than $\theta \in \Theta$. Similarly,

\[ \Pr\{\varphi \le z \mid \theta\} \le \mathscr{M}(z, \theta) \times \Pr\{\varphi \le z \mid \vartheta(z)\} \le \mathscr{M}(z, \theta) \qquad (2) \]

for $z \in \mathscr{Z}$ such that $\vartheta(z)$ is no greater than $\theta \in \Theta$.

Assume that the following additional assumptions are satisfied:

(a) $\vartheta(z) = z$ for any $z \in \Theta$;

(b) $f_n(x_1, \dots, x_n; \theta)$ can be expressed as a function $g(\varphi, \theta)$ of
$\varphi = \varphi(x_1, \dots, x_n)$ and $\theta$;

(c) $g(z, \theta)$ is non-decreasing with respect to $\theta \in \Theta$ no greater than
$z \in \Theta$ and is non-increasing with respect to $\theta \in \Theta$ no less than $z \in \Theta$.

Then, the following statements hold true:

(i) $\mathscr{M}(z, \theta) = \Lambda(z, z, \theta) = \frac{g(z, \theta)}{g(z, z)}$ for $z, \theta \in \Theta$.

(ii) $\mathscr{M}(z, \theta)$ is non-decreasing with respect to $\theta \in \Theta$ no greater than
$z \in \Theta$ and is non-increasing with respect to $\theta \in \Theta$ no less than $z \in \Theta$.

(iii) $\mathscr{M}(z, \theta)$ is non-decreasing with respect to $z \in \Theta$ no greater than
$\theta \in \Theta$ and is non-increasing with respect to $z \in \Theta$ no less than $\theta \in \Theta$.

The proof of Theorem 1 is provided in Appendix A. Since inequalities (1) and (2) are derived
from the MLRP, these inequalities are referred to as the Likelihood Ratio Bounds in this paper
and its previous version [3].

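An immediate application of Theorem 1 can be found in the area of statistical hypothesis
testing. It is a frequent problem to test hypothesis $\mathscr{H}_0 : \theta \le \theta_0$ versus
$\mathscr{H}_1 : \theta \ge \theta_1$, where $\theta_0 < \theta_1$ are two parametric values in
$\Theta$. Assume that there is a statistic $\widehat{\theta}$ defined in terms of $X_1, \dots, X_n$
such that the probability ratio $\frac{f_n(X_1, \dots, X_n; \theta_1)}{f_n(X_1, \dots, X_n; \theta_0)}$
can be expressed as $\Lambda(\widehat{\theta}, \theta_0, \theta_1)$, which is increasing
with respect to $\widehat{\theta}$. To test the hypotheses, a classical method is to choose a number
$\gamma \in \Theta$ such that $\theta_0 < \gamma < \theta_1$ and make the decision: accept
$\mathscr{H}_0$ if $\widehat{\theta} \le \gamma$ and otherwise reject $\mathscr{H}_0$. To offer
simple bounds for the risks of making an erroneous decision, we have obtained the following new
result:

\[ \Pr\{\text{Reject } \mathscr{H}_0 \mid \mathscr{H}_0\} \le \Lambda(\gamma, \gamma, \theta_0), \qquad \Pr\{\text{Reject } \mathscr{H}_1 \mid \mathscr{H}_1\} \le \Lambda(\gamma, \gamma, \theta_1). \qquad (3) \]

To prove (3), note that

\[ \Pr\{\text{Reject } \mathscr{H}_0 \mid \mathscr{H}_0\} = \Pr\{\widehat{\theta} > \gamma \mid \mathscr{H}_0\} \le \Pr\{\widehat{\theta} > \gamma \mid \theta_0\} \le \Lambda(\gamma, \gamma, \theta_0), \]

where the first inequality is due to the monotonicity of the likelihood ratio, and the second
inequality is a consequence of Theorem 1. Similarly,

\[ \Pr\{\text{Reject } \mathscr{H}_1 \mid \mathscr{H}_1\} = \Pr\{\widehat{\theta} \le \gamma \mid \mathscr{H}_1\} \le \Pr\{\widehat{\theta} \le \gamma \mid \theta_1\} \le \Lambda(\gamma, \gamma, \theta_1). \]

It can be checked that such bounds apply to the exponential family and the hypergeometric
distribution.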



3 Bounds on the Distribution of Likelihood Ratio 

Let $f_X(x; \theta)$ denote the probability density (or mass) function of $X$ parameterized by
$\theta \in \Theta$. Let $X_1, X_2, \dots$ be i.i.d. samples of $X$. Consider hypothesis
$\mathscr{H}_0 : \theta = \theta_0$. Assume that for a sample of size $n$, there exists a maximum
likelihood estimator (MLE) $\widehat{\theta}_n$ for $\theta_0$ such that the sequence of
estimators $\widehat{\theta}_n$, $n = 1, 2, \dots$ converges in probability to $\theta_0$. Define
likelihood ratio

\[ \lambda_n = \frac{\prod_{i=1}^{n} f_X(X_i; \theta_0)}{\prod_{i=1}^{n} f_X(X_i; \widehat{\theta}_n)}, \qquad n = 1, 2, \dots \]

Assume that $\widehat{\theta}_n$ is asymptotically normally distributed with mean $\theta_0$. In this
setting, Wilks proved that

\[ \lim_{n \to \infty} \Pr\{-2 \ln \lambda_n < \chi^2 \mid \theta_0\} = \frac{1}{\sqrt{2\pi}} \int_0^{\chi^2} u^{-1/2} e^{-u/2} \, du, \]

that is, if $\mathscr{H}_0$ is true, $-2 \ln \lambda_n$, $n = 1, 2, \dots$ converges in distribution
to the chi-square distribution of degree one. The proof of this result can be found in pages 410-411
of Wilks' textbook Mathematical Statistics. This result has important application for testing
hypothesis $\mathscr{H}_0 : \theta = \theta_0$. Suppose the decision rule is that $\mathscr{H}_0$ is
rejected if $-2 \ln \lambda_n > \chi^2_\alpha$, where $\chi^2_\alpha$ is the number for
which $\Pr\{\chi^2 > \chi^2_\alpha\} = \alpha$. Then,
$\lim_{n \to \infty} \Pr\{\text{Reject } \mathscr{H}_0 \mid \mathscr{H}_0\} = \alpha$.

The drawback of the asymptotic result is that it is not clear how large the sample size $n$ must
be for the asymptotic distribution to be applicable. To address this issue, it is desirable to
obtain tight bounds for the distribution of $-2 \ln \lambda_n$. For this purpose, we can apply
Theorem 1 to derive the following results.

Theorem 2 Let $\alpha$ be a positive number and $n$ be a positive integer. Let
$f_n(x_1, \dots, x_n; \theta)$ denote the joint probability density or mass function of random
variables $X_1, \dots, X_n$ parameterized by $\theta \in \Theta$. Assume that
$\frac{f_n(X_1, \dots, X_n; \theta_1)}{f_n(X_1, \dots, X_n; \theta_0)}$ can be expressed as a
function, $\Lambda(\varphi_n, \theta_0, \theta_1)$, of $\theta_0, \theta_1$ and
$\varphi_n = \varphi(X_1, \dots, X_n)$ such that $\Lambda(\varphi_n, \theta_0, \theta_1)$ is
increasing with respect to $\varphi_n$. Let $\widehat{\theta}_n$ be a function
for $\theta \in \Theta$. Moreover, under additional assumption that $\widehat{\theta}_n$ is a MLE
for $\theta$, the following inequalities
hold true for arbitrary nonempty subset $\mathscr{S}$ of $\Theta$ and all $\theta \in \mathscr{S}$.

See Appendix B for a proof. To apply the inequalities in the first part of Theorem 2, there is no
necessity for $X_1, \dots, X_n$ to be i.i.d. and $\widehat{\theta}_n$ to be a MLE for $\theta$.
Applying Theorem 2 to the likelihood ratio

\[ \lambda_n = \frac{f_n(X_1, \dots, X_n; \theta_0)}{f_n(X_1, \dots, X_n; \widehat{\theta}_n)} \]

yields

\[ \Pr\{-2 \ln \lambda_n > \chi^2 \mid \theta_0\} \le 2 \exp\left(-\frac{\chi^2}{2}\right). \]

As a by-product, we have proved the inequality

\[ \frac{1}{\sqrt{2\pi}} \int_z^{\infty} u^{-1/2} e^{-u/2} \, du \le 2 \exp\left(-\frac{z}{2}\right), \qquad z > 0. \]
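The by-product inequality can be checked numerically. The following Python sketch is only an illustration; it assumes the standard identity that the chi-square tail with one degree of freedom equals $\mathrm{erfc}(\sqrt{z/2})$, and the function names are hypothetical.

```python
import math


def chi2_one_df_tail(z: float) -> float:
    """Exact Pr{chi^2 > z} for one degree of freedom, via erfc(sqrt(z/2))."""
    return math.erfc(math.sqrt(z / 2.0))


def likelihood_ratio_tail_bound(z: float) -> float:
    """Upper bound 2*exp(-z/2) appearing in the by-product inequality."""
    return 2.0 * math.exp(-z / 2.0)


if __name__ == "__main__":
    # Compare the exact tail with the bound at a few thresholds.
    for z in (0.5, 1.0, 3.84, 6.63, 10.0):
        exact = chi2_one_df_tail(z)
        bound = likelihood_ratio_tail_bound(z)
        print(f"z={z:6.2f}  exact={exact:.6f}  bound={bound:.6f}")
```

The bound holds for every finite $n$ in the setting above, whereas the exact chi-square tail is only the limiting value.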



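With regard to testing hypothesis $\mathscr{H}_0 : \theta = \theta_0$, if the decision rule is to
reject $\mathscr{H}_0$ when $\lambda_n \le \frac{\alpha}{2}$, then

\[ \Pr\{\text{Reject } \mathscr{H}_0 \mid \mathscr{H}_0\} = \Pr\left\{\lambda_n \le \frac{\alpha}{2} \;\Big|\; \theta_0\right\} \le \alpha. \]

Since the acceptance region is

\[ \left\{ (X_1, \dots, X_n) : \frac{f_n(X_1, \dots, X_n; \theta_0)}{f_n(X_1, \dots, X_n; \widehat{\theta}_n)} > \frac{\alpha}{2} \right\}, \]

it follows that inverting the acceptance region leads to a confidence region for $\theta$ with coverage
probability no less than $1 - \alpha$. Specifically, if we define random region

\[ \mathscr{R} = \left\{ \vartheta \in \Theta : \frac{f_n(X_1, \dots, X_n; \vartheta)}{f_n(X_1, \dots, X_n; \widehat{\theta}_n)} > \frac{\alpha}{2} \right\}, \]

then $\Pr\{\theta \in \mathscr{R} \mid \theta\} \ge 1 - \alpha$ for all $\theta \in \Theta$. It can be
shown that $\mathscr{R}$ is actually an interval if $\widehat{\theta}_n$ is a MLE for $\theta$. We will
return to the problem of interval estimation later.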

4 Probabilistic Inequalities for Exponential Family 

Our main objective for this section is to develop a unified theory for bounding the tail proba- 
bilities of the exponential family. A single-parameter exponential family is a set of probability 
distributions whose probability density function (or probability mass function, for the case of a 
discrete distribution) can be expressed in the form 

\[ f_X(x, \theta) = h(x) \exp\big(\eta(\theta) T(x) - A(\theta)\big), \qquad \theta \in \Theta \qquad (10) \]

where $T(x)$, $h(x)$, $\eta(\theta)$, and $A(\theta)$ are known functions.

For the exponential family described above, we have the following results. 

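Theorem 3 Let $X$ be a random variable with probability density function or probability mass
function defined by (10). Let $X_1, \dots, X_n$ be i.i.d. samples of $X$. Define
$\widehat{\theta} = \frac{\sum_{i=1}^{n} T(X_i)}{n}$ and
$\mathscr{M}(z, \theta) = \left[ \frac{\exp(\eta(\theta) z - A(\theta))}{\exp(\eta(z) z - A(z))} \right]^n$
for $z, \theta \in \Theta$. Suppose that $\frac{d\eta(\theta)}{d\theta}$ is positive for
$\theta \in \Theta$. Then,

\[ \Pr\{\widehat{\theta} \ge z \mid \theta\} \le \mathscr{M}(z, \theta) \times \Pr\{\widehat{\theta} \ge z \mid z\} \quad \text{for } z \in \Theta \text{ no less than } \theta \in \Theta \]

and

\[ \Pr\{\widehat{\theta} \le z \mid \theta\} \le \mathscr{M}(z, \theta) \times \Pr\{\widehat{\theta} \le z \mid z\} \quad \text{for } z \in \Theta \text{ no greater than } \theta \in \Theta. \]

Moreover, under the additional assumption that
$\frac{dA(\theta)}{d\theta} = \theta \frac{d\eta(\theta)}{d\theta}$, the following statements hold true:

(i) $\widehat{\theta}$ is a maximum-likelihood and unbiased estimator of $\theta$.

(ii) $\mathscr{M}(z, \theta) = \inf_{t \in \mathbb{R}} \mathbb{E}\big[\exp\big(nt(\widehat{\theta} - z)\big)\big]$,
where the infimum is attained at $t = \eta(z) - \eta(\theta)$.

(iii) $\mathscr{M}(z, \theta)$ is increasing with respect to $\theta \in \Theta$ no greater than
$z \in \Theta$ and is decreasing with respect to $\theta \in \Theta$ no less than $z \in \Theta$.

(iv) $\mathscr{M}(z, \theta)$ is increasing with respect to $z \in \Theta$ no greater than
$\theta \in \Theta$ and is decreasing with respect to $z \in \Theta$ no less than $\theta \in \Theta$.

(v)

\[ \Pr\{\widehat{\theta} \ge z \mid z\} \le \frac{1}{2} + \frac{C_{\mathrm{BE}}}{\sqrt{n}} \frac{\mathbb{E}[|T(X) - z|^3]}{\mathbb{E}^{3/2}[|T(X) - z|^2]} \le \frac{1}{2} + \frac{C_{\mathrm{BE}}}{\sqrt{n}} \frac{\mathbb{E}^{3/4}[|T(X) - z|^4]}{\mathbb{E}^{3/2}[|T(X) - z|^2]}, \qquad (11) \]

\[ \Pr\{\widehat{\theta} \le z \mid z\} \le \frac{1}{2} + \frac{C_{\mathrm{BE}}}{\sqrt{n}} \frac{\mathbb{E}[|T(X) - z|^3]}{\mathbb{E}^{3/2}[|T(X) - z|^2]} \le \frac{1}{2} + \frac{C_{\mathrm{BE}}}{\sqrt{n}} \frac{\mathbb{E}^{3/4}[|T(X) - z|^4]}{\mathbb{E}^{3/2}[|T(X) - z|^2]}, \qquad (12) \]

where the expectation is taken with $\theta = z$ and $C_{\mathrm{BE}}$ is the absolute constant in
the Berry-Esseen inequality.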



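The proof of Theorem 3 is given in Appendix C. By the assumption that $\eta(\theta)$ is increasing
with respect to $\theta$, it follows from statement (ii) that

\[ \mathscr{M}(z, \theta) = \begin{cases} \inf_{t \le 0} \mathbb{E}\big[\exp\big(nt(\widehat{\theta} - z)\big)\big] & \text{for } z \le \theta, \\ \inf_{t \ge 0} \mathbb{E}\big[\exp\big(nt(\widehat{\theta} - z)\big)\big] & \text{for } z \ge \theta. \end{cases} \]

This implies that the likelihood ratio bound coincides with Chernoff bound for the exponential
family.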
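The coincidence of the likelihood ratio bound with the Chernoff bound can be illustrated for Bernoulli samples. The Python sketch below is a hedged numerical check rather than part of the proof: it assumes the Bernoulli parameterization $\eta(p) = \ln\frac{p}{1-p}$, $A(p) = \ln\frac{1}{1-p}$, uses the closed-form moment generating function of a Bernoulli variable, and all function names are hypothetical.

```python
import math


def lr_bound(z: float, p: float, n: int) -> float:
    """M(z, p) = [exp(eta(p) z - A(p)) / exp(eta(z) z - A(z))]^n for Bernoulli."""
    def eta(q: float) -> float:
        return math.log(q / (1.0 - q))      # natural parameter

    def a_fn(q: float) -> float:
        return -math.log(1.0 - q)           # A(q) = ln(1 / (1 - q))

    return math.exp(n * ((eta(p) - eta(z)) * z - a_fn(p) + a_fn(z)))


def chernoff_objective(t: float, z: float, p: float, n: int) -> float:
    """E[exp(n t (theta_hat - z))] for the mean of n i.i.d. Bernoulli(p) samples."""
    return math.exp(n * (math.log(1.0 - p + p * math.exp(t)) - t * z))


def chernoff_minimizer(z: float, p: float) -> float:
    """t = eta(z) - eta(p), the minimizer named in statement (ii)."""
    return math.log(z / (1.0 - z)) - math.log(p / (1.0 - p))


if __name__ == "__main__":
    z, p, n = 0.4, 0.3, 50
    t_star = chernoff_minimizer(z, p)
    print("likelihood ratio bound:", lr_bound(z, p, n))
    print("Chernoff bound        :", chernoff_objective(t_star, z, p, n))
```

For $z > p$ the minimizer is positive and for $z < p$ it is negative, in agreement with the two cases displayed above.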

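Theorem 3 involves the famous Berry-Esseen inequality, which asserts the following:
Let $Y_1, Y_2, \dots$ be i.i.d. samples of random variable $Y$ such that $\mathbb{E}[Y] = 0$,
$\mathbb{E}[Y^2] > 0$, and $\mathbb{E}[|Y|^3] < \infty$. Also, let $F_n$ be the cdf of
$\frac{\sum_{i=1}^{n} Y_i}{\sqrt{n \mathbb{E}[Y^2]}}$, and $\Phi$ the cdf of the standard normal
distribution. Then, there exists a positive constant $C_{\mathrm{BE}}$ such that for all $y$ and $n$,

\[ |F_n(y) - \Phi(y)| \le \frac{C_{\mathrm{BE}}}{\sqrt{n}} \frac{\mathbb{E}[|Y|^3]}{\mathbb{E}^{3/2}[Y^2]}. \]

A few years ago, Shevtsova [8] proved that the constant $C_{\mathrm{BE}} < 0.7056$. More recently,
Tyurin [9] has shown that $C_{\mathrm{BE}} < 0.4785 < \frac{1}{2}$.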
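To give a feeling for the size of the Berry-Esseen term, the following Python sketch compares the exact Kolmogorov distance between a standardized binomial sum and the standard normal law with the Berry-Esseen bound. It is an illustration only: it assumes centered Bernoulli summands $Y = X - p$, takes $C_{\mathrm{BE}} = 0.4785$ from the bound quoted above, and the function names are hypothetical.

```python
import math

C_BE = 0.4785  # upper bound on the Berry-Esseen constant quoted in the text


def std_normal_cdf(y: float) -> float:
    """Cdf of the standard normal distribution."""
    return 0.5 * math.erfc(-y / math.sqrt(2.0))


def kolmogorov_distance_binomial(n: int, p: float) -> float:
    """sup_y |F_n(y) - Phi(y)| for the standardized sum of n Bernoulli(p) samples.

    F_n is a step function, so the supremum is attained at a jump point,
    either just before or just after the jump.
    """
    q = 1.0 - p
    sd = math.sqrt(n * p * q)
    cdf_left = 0.0
    worst = 0.0
    for k in range(n + 1):
        phi = std_normal_cdf((k - n * p) / sd)
        cdf_right = cdf_left + math.comb(n, k) * p**k * q**(n - k)
        worst = max(worst, abs(cdf_left - phi), abs(cdf_right - phi))
        cdf_left = cdf_right
    return worst


def berry_esseen_bound(n: int, p: float) -> float:
    """C_BE * E|Y|^3 / (sqrt(n) * E^{3/2}[Y^2]) for Y = X - p, X ~ Bernoulli(p)."""
    q = 1.0 - p
    third_abs_moment = p * q * (p * p + q * q)
    return C_BE * third_abs_moment / (math.sqrt(n) * (p * q) ** 1.5)


if __name__ == "__main__":
    for n, p in ((20, 0.5), (50, 0.3), (200, 0.1)):
        print(n, p, kolmogorov_distance_binomial(n, p), berry_esseen_bound(n, p))
```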

5 Bounds of Tail Probabilities 

In this section, we shall apply our general results to derive sharp bounds for the tail probabilities 
of some common distributions. 

5.1 Binomial Distribution 

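The probability mass function of a Bernoulli random variable, $X$, of mean value $p \in (0, 1)$ is
given by

\[ f(x, p) = \Pr\{X = x \mid p\} = p^x (1 - p)^{1 - x} = h(x) \exp\big(\eta(p) T(x) - A(p)\big), \qquad x \in \{0, 1\} \]

where

\[ T(x) = x, \qquad h(x) = 1, \qquad \eta(p) = \ln \frac{p}{1 - p}, \qquad A(p) = \ln \frac{1}{1 - p}. \]

Since $\frac{dA(p)}{dp} = p \frac{d\eta(p)}{dp}$ holds, making use of Theorem 3, we have the
following results.

Corollary 1 Let $X_1, \dots, X_n$ be i.i.d. samples of Bernoulli random variable $X$ of mean value
$p \in (0, 1)$. Define $\mathscr{M}_{\mathrm{B}}(z, p) = z \ln \frac{p}{z} + (1 - z) \ln \frac{1 - p}{1 - z}$
for $z \in (0, 1)$ and $p \in (0, 1)$. Then,

\[ \Pr\left\{ \sum_{i=1}^{n} X_i \ge nz \right\} \le \left( \frac{1}{2} + \Delta \right) \exp\big(n \mathscr{M}_{\mathrm{B}}(z, p)\big) \quad \text{for } z \in (p, 1), \]

\[ \Pr\left\{ \sum_{i=1}^{n} X_i \le nz \right\} \le \left( \frac{1}{2} + \Delta \right) \exp\big(n \mathscr{M}_{\mathrm{B}}(z, p)\big) \quad \text{for } z \in (0, p), \]

where

\[ \Delta = \min\left\{ \frac{1}{2}, \; \frac{C_{\mathrm{BE}} \, [z^2 + (1 - z)^2]}{\sqrt{n z (1 - z)}} \right\}. \]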
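As a numerical illustration of Corollary 1, the following Python sketch compares the exact binomial tail with the bound $(\frac{1}{2} + \Delta)\exp(n\mathscr{M}_{\mathrm{B}}(z,p))$ at thresholds $z = k/n$. It is a sketch under stated assumptions: $C_{\mathrm{BE}}$ is set to $0.4785$, the upper bound quoted in Section 4, and the function names are hypothetical.

```python
import math

C_BE = 0.4785  # upper bound on the Berry-Esseen constant quoted in Section 4


def m_b(z: float, p: float) -> float:
    """z ln(p/z) + (1 - z) ln((1 - p)/(1 - z)) for z, p in (0, 1)."""
    return z * math.log(p / z) + (1.0 - z) * math.log((1.0 - p) / (1.0 - z))


def corollary1_bound(k: int, n: int, p: float) -> float:
    """(1/2 + Delta) * exp(n * M_B(z, p)) with z = k/n, 0 < k < n."""
    z = k / n
    delta = min(0.5, C_BE * (z * z + (1.0 - z) ** 2) / math.sqrt(n * z * (1.0 - z)))
    return (0.5 + delta) * math.exp(n * m_b(z, p))


def binom_pmf(n: int, p: float, k: int) -> float:
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def upper_tail(n: int, p: float, k: int) -> float:
    """Exact Pr{sum X_i >= k}."""
    return sum(binom_pmf(n, p, j) for j in range(k, n + 1))


def lower_tail(n: int, p: float, k: int) -> float:
    """Exact Pr{sum X_i <= k}."""
    return sum(binom_pmf(n, p, j) for j in range(0, k + 1))


if __name__ == "__main__":
    n, p = 100, 0.3
    for k in (35, 40, 50):      # z = k/n > p: upper tail
        print(k, upper_tail(n, p, k), corollary1_bound(k, n, p))
    for k in (25, 20, 10):      # z = k/n < p: lower tail
        print(k, lower_tail(n, p, k), corollary1_bound(k, n, p))
```

Dropping the factor $\frac{1}{2} + \Delta$ recovers the classical Chernoff-Hoeffding bound, so the printed values also show how much the factor gains.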

An important application of Corollary 1 can be found in the determination of sample size for
estimating binomial parameters. Let $X_1, X_2, \dots$ be i.i.d. samples of Bernoulli random
variable $X$ such that $\Pr\{X = 1\} = 1 - \Pr\{X = 0\} = p \in (0, 1)$. Define
$\widehat{p}_n = \frac{\sum_{i=1}^{n} X_i}{n}$. A classical problem in probability and statistics
theory is as follows:

Let $\varepsilon \in (0, 1)$ and $\delta \in (0, 1)$ be the margin of absolute error and the
confidence parameter, respectively. How large must $n$ be to ensure

\[ \Pr\{|\widehat{p}_n - p| < \varepsilon\} > 1 - \delta \qquad (13) \]

for any $p \in (0, 1)$? The best explicit bound so far is the well-known Chernoff-Hoeffding bound
which asserts that (13) is guaranteed for any $p \in (0, 1)$ provided that

\[ n > \frac{\ln \frac{2}{\delta}}{2 \varepsilon^2}. \qquad (14) \]

By virtue of Corollary 1, we have obtained a better explicit sample size bound as follows.

Theorem 4 Let < e < § and < 6 < 2exp (- ^l^y- ) ■ Then, Pr{|p - p\ < £ \ p} > 1 - 5 for 
any p £ (0, 1) provided that 

l l+c / \ 

n> 2? ' (15) 

where 

c= 4Cbe 



111 7 



The domain of $(\varepsilon, \delta)$ for which our sample size bound (15) can be used is shown by Figure 1.
Clearly, a sufficient but not necessary condition to use our formula (fT5|) is < e < | , < <5 < j . 

The improvement of our sample size bound (15) upon the Chernoff-Hoeffding bound (14) is shown
by Figure 2. It can be seen that for a typical requirement of confidence level $100(1 - \delta)\%$
(e.g., 95%), the improvement can be 20% to 30%.
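For reference, the baseline Chernoff-Hoeffding sample size can be computed as in the Python sketch below. It assumes the standard form of that bound, $n > \ln(2/\delta) / (2\varepsilon^2)$, and returns the smallest integer satisfying the strict inequality; the function name is hypothetical.

```python
import math


def hoeffding_sample_size(eps: float, delta: float) -> int:
    """Smallest integer n with n > ln(2/delta) / (2 * eps^2)."""
    if not (0.0 < eps < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("eps and delta must lie in (0, 1)")
    threshold = math.log(2.0 / delta) / (2.0 * eps * eps)
    return math.floor(threshold) + 1


if __name__ == "__main__":
    for eps, delta in ((0.05, 0.05), (0.1, 0.01), (0.01, 0.05)):
        print(eps, delta, hoeffding_sample_size(eps, delta))
```

A 20% to 30% reduction relative to these values is the order of improvement reported above for typical confidence levels.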

Corollary 1 is also useful for the study of inverse binomial sampling. Let $\gamma$ be a positive
integer. Define random number $\mathbf{n}$ as the minimum integer such that the summation of
$\mathbf{n}$ consecutive Bernoulli random variables of common mean $p \in (0, 1)$ is equal to
$\gamma$. In other words, $\mathbf{n}$ is a random variable satisfying
$\sum_{i=1}^{\mathbf{n} - 1} X_i < \gamma = \sum_{i=1}^{\mathbf{n}} X_i$, where $X_1, X_2, \dots$
are i.i.d. samples of Bernoulli random variable $X$ such that
$\Pr\{X = 1\} = 1 - \Pr\{X = 0\} = p \in (0, 1)$ as mentioned earlier. This means that $\mathbf{n}$
is the least number of Bernoulli trials of success rate $p \in (0, 1)$ needed to obtain $\gamma$
successes. By virtue of Corollary 1, we have obtained the following results.

Figure 1: Region of $(\varepsilon, \delta)$

Figure 2: Comparison with Chernoff bound
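Corollary 2

\[ \Pr\left\{ \frac{\gamma}{\mathbf{n}} \le z \right\} \le \left( \frac{1}{2} + \Delta \right) \exp\left( \frac{\gamma \, \mathscr{M}_{\mathrm{B}}(z, p)}{z} \right) \quad \text{for } z \in (0, p) \text{ such that } \frac{\gamma}{z} \text{ is an integer}, \]

\[ \Pr\left\{ \frac{\gamma}{\mathbf{n}} \ge z \right\} \le \left( \frac{1}{2} + \Delta \right) \exp\left( \frac{\gamma \, \mathscr{M}_{\mathrm{B}}(z, p)}{z} \right) \quad \text{for } z \in (p, 1) \text{ such that } \frac{\gamma}{z} \text{ is an integer}, \]

where $\mathscr{M}_{\mathrm{B}}(z, p) = z \ln \frac{p}{z} + (1 - z) \ln \frac{1 - p}{1 - z}$ as in
Corollary 1 and

\[ \Delta = \min\left\{ \frac{1}{2}, \; \frac{C_{\mathrm{BE}} \, [z^2 + (1 - z)^2]}{\sqrt{\gamma (1 - z)}} \right\}. \]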



Similar to the sample size problem associated with (13), it is an important problem to estimate
the binomial parameter $p$ with a relative precision. Specifically, consider an inverse binomial
sampling scheme as described above. Define $\widehat{p}_\gamma = \frac{\gamma}{\mathbf{n}}$ as an
estimator for $p$. A fundamental problem of practical importance is stated as follows:

Let $\varepsilon \in (0, 1)$ and $\delta \in (0, 1)$ be the margin of relative error and the
confidence parameter, respectively. How large must $\gamma$ be to ensure

\[ \Pr\left\{ \left| \frac{\widehat{p}_\gamma - p}{p} \right| < \varepsilon \right\} > 1 - \delta \qquad (16) \]

for any $p \in (0, 1)$?

By virtue of Corollary 2, we have established the following results regarding the above question.

Theorem 5 The following statements (I) and (II) hold true. 

(I) Pr I < eX > 1 - 6 for any p E (0, 1) provided that e>0, < 5 < 1 and 



7 > 



+ 



mi 



(1 + e) ln(l + e) - e 6' 
(II) Pr I < ej > 1 - 5 for any p E (0, 1) provided that < e < I, 

3e 3 (4 + e) +4e(3 + e)ln2 



(17) 



< 5 < exp 
and 



4(9 - 6e - 2e 2 ) 



3e 2 (4 + e) +4(3 + e)ln2 



4(9 - 6s - 2e 2 ) 



3(1 + £)(3 + e)ln 2 
2(9 - 6e - 2e 2 ) 



7 > 



+ 



In 



m — z — mz 



i + C 

(1 + e) ln(l + e)-£~" <5 ! 
u;zf/i m = 4- In t z = I + ^ 



(18) 



9 In 4 



The domain of $(\varepsilon, \delta)$ for which our sample size bound (18) can be used is shown by Figure 3.
Clearly, a sufficient but not necessary condition to use our formula {THJ) is < e < |, < 5 < \. 



Figure 3: Region of $(\varepsilon, \delta)$ (horizontal axis: margin of relative error $\varepsilon$)

5.2 Negative Binomial Distribution 

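The probability mass function of a negative binomial random variable, $X$, is given by

\[ f(x, \theta) = \Pr\{X = x \mid p\} = \frac{\Gamma(x + r)}{\Gamma(x + 1) \, \Gamma(r)} \, p^r (1 - p)^x = h(x) \exp\big(\eta(\theta) T(x) - A(\theta)\big) \quad \text{for } x = 0, 1, 2, \dots \]

where $r$ is a real, positive number, $\theta = \frac{1}{p}$, and

\[ T(x) = \frac{r + x}{r}, \qquad h(x) = \frac{\Gamma(x + r)}{\Gamma(x + 1) \, \Gamma(r)}, \qquad \eta(\theta) = r \ln\left(1 - \frac{1}{\theta}\right), \qquad A(\theta) = r \ln(\theta - 1). \]

Since $\frac{dA(\theta)}{d\theta} = \theta \frac{d\eta(\theta)}{d\theta}$ holds, by Theorem 3, we have
the following result.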



Corollary 3 Let $X_1, \dots, X_n$ be i.i.d. samples of negative binomial random variable $X$
parameterized by $\theta = \frac{1}{p}$. Then,



Pr > nz \ < 

Pr\j2T(Xi) <nz\ < 



i=l 




for 1 > z > 
for0<z< 



1 

1 

V 

1 

V 



5.3 Poisson Distribution 



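The probability mass function of a Poisson random variable, $X$, of mean value $\lambda$ is given by

\[ f(x, \lambda) = \Pr\{X = x \mid \lambda\} = \frac{\lambda^x e^{-\lambda}}{x!} = h(x) \exp\big(\eta(\lambda) T(x) - A(\lambda)\big), \qquad x \in \{0, 1, 2, \dots\} \]

where

\[ T(x) = x, \qquad h(x) = \frac{1}{x!}, \qquad \eta(\lambda) = \ln \lambda, \qquad A(\lambda) = \lambda. \]

The moment generating function is $M(t) = \mathbb{E}[e^{tX}] = e^{-\lambda} \exp(\lambda e^t)$.
Clearly, $M'(t) = \lambda e^t M(t)$ and $\mathbb{E}[X] = M'(0) = \lambda$. It can be shown by
induction that

\[ \frac{d^{\ell+1} M(t)}{dt^{\ell+1}} = \lambda e^t \sum_{i=0}^{\ell} \binom{\ell}{i} \frac{d^i M(t)}{dt^i} \]

for $\ell = 0, 1, 2, \dots$. Hence,

\[ \mathbb{E}[X^{\ell+1}] = \frac{d^{\ell+1} M(t)}{dt^{\ell+1}} \bigg|_{t=0} = \lambda \sum_{i=0}^{\ell} \binom{\ell}{i} \mathbb{E}[X^i], \]

so that $\mathbb{E}[X^2] = \lambda + \lambda^2$, $\mathbb{E}[X^3] = \lambda + 3\lambda^2 + \lambda^3$,
$\mathbb{E}[X^4] = \lambda + 7\lambda^2 + 6\lambda^3 + \lambda^4$,

\[ \mathbb{E}[|X - \lambda|^2] = \lambda, \qquad \mathbb{E}[|X - \lambda|^4] = \sum_{i=0}^{4} \binom{4}{i} (-\lambda)^i \, \mathbb{E}[X^{4-i}] = \lambda (3\lambda + 1) \]

and

\[ \frac{\mathbb{E}^{3/4}[|X - \lambda|^4]}{\mathbb{E}^{3/2}[|X - \lambda|^2]} = \left( 3 + \frac{1}{\lambda} \right)^{3/4}. \qquad (19) \]

Since $\frac{dA(\lambda)}{d\lambda} = \lambda \frac{d\eta(\lambda)}{d\lambda}$ holds, making use of
(19) and Theorem 3, we have the following results.

Corollary 4 Let $X_1, \dots, X_n$ be i.i.d. samples of Poisson random variable $X$ of mean value
$\lambda$. Then,

\[ \Pr\left\{ \frac{\sum_{i=1}^{n} X_i}{n} \ge z \;\Big|\; \lambda \right\} \le \left( \frac{1}{2} + \Delta \right) e^{-n\lambda} \left( \frac{\lambda e}{z} \right)^{nz} \quad \text{for } z \ge \lambda, \]

\[ \Pr\left\{ \frac{\sum_{i=1}^{n} X_i}{n} \le z \;\Big|\; \lambda \right\} \le \left( \frac{1}{2} + \Delta \right) e^{-n\lambda} \left( \frac{\lambda e}{z} \right)^{nz} \quad \text{for } 0 < z \le \lambda, \]

where

\[ \Delta = \min\left\{ \frac{1}{2}, \; \frac{C_{\mathrm{BE}}}{\sqrt{n}} \left( 3 + \frac{1}{z} \right)^{3/4} \right\}. \]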
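The Poisson bound can be examined numerically. The Python sketch below is an illustration under stated assumptions: it applies the first inequality of statement (v) in Theorem 3 directly, with the third absolute central moment of a Poisson variable of mean $z$ evaluated by truncated summation rather than through a closed-form fourth moment; $C_{\mathrm{BE}}$ is set to $0.4785$; and the function names are hypothetical. It uses the fact that a sum of $n$ i.i.d. Poisson variables of mean $\lambda$ is Poisson with mean $n\lambda$.

```python
import math

C_BE = 0.4785  # upper bound on the Berry-Esseen constant quoted in Section 4


def poisson_pmf(k: int, mu: float) -> float:
    """Poisson probability mass, computed in the log domain for stability."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def abs_central_moment(z: float, order: int, terms: int = 400) -> float:
    """E|X - z|^order for X ~ Poisson(z), by truncated summation (assumes small z)."""
    return sum(abs(k - z) ** order * poisson_pmf(k, z) for k in range(terms))


def poisson_lr_bound(k: int, n: int, lam: float) -> float:
    """(1/2 + Delta) * exp(-n lam) * (lam e / z)^(n z) with z = k/n, k >= 1."""
    z = k / n
    chernoff = math.exp(n * (z * math.log(lam / z) + z - lam))
    rho3 = abs_central_moment(z, 3)
    delta = min(0.5, C_BE * rho3 / (math.sqrt(n) * z**1.5))
    return (0.5 + delta) * chernoff


def poisson_lower_tail(k: int, mu: float) -> float:
    """Exact Pr{S <= k} for S ~ Poisson(mu)."""
    return sum(poisson_pmf(j, mu) for j in range(k + 1))


def poisson_upper_tail(k: int, mu: float) -> float:
    """Exact Pr{S >= k} for S ~ Poisson(mu)."""
    return 1.0 - sum(poisson_pmf(j, mu) for j in range(k))


if __name__ == "__main__":
    n, lam = 20, 2.0
    for k in (44, 50, 60):      # z = k/n >= lam: upper tail
        print(k, poisson_upper_tail(k, n * lam), poisson_lr_bound(k, n, lam))
    for k in (36, 30, 20):      # z = k/n <= lam: lower tail
        print(k, poisson_lower_tail(k, n * lam), poisson_lr_bound(k, n, lam))
```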



5.4 Hypergeometric Distribution 

The hypergeometric distribution can be described by the following model. Consider a finite 
population of N units, of which there are M units having a certain attribute. Draw n units from 
the whole population by sampling without replacement. Let K denote the number of units having 
the attribute found in the n draws. Then, K is a random variable possessing a hypergeometric 
distribution such that 



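\[ \Pr\{K = k\} = \frac{\binom{M}{k} \binom{N - M}{n - k}}{\binom{N}{n}}, \qquad k = 0, 1, \dots, n. \]

It can be verified that

\[ \frac{\Pr\{K = k + 1 \mid M_1\}}{\Pr\{K = k + 1 \mid M_0\}} \left[ \frac{\Pr\{K = k \mid M_1\}}{\Pr\{K = k \mid M_0\}} \right]^{-1} = \frac{(M_1 - k)(N - M_0 - n + k + 1)}{(M_0 - k)(N - M_1 - n + k + 1)} \ge 1 \]

for $M_1 > M_0$, which implies that the hypergeometric distribution possesses the MLRP.
Consequently, applying Theorem 1, we have the following results.

Corollary 5 Let $\mathcal{M} = \mathcal{M}(k)$ be a function of $k \in I_K$, which takes values in
$\{m \in \mathbb{Z} : k \le m \le N\}$. Then,

\[ \Pr\{K \le k \mid M\} \le \frac{\binom{M}{k} \binom{N - M}{n - k}}{\binom{\mathcal{M}(k)}{k} \binom{N - \mathcal{M}(k)}{n - k}} \quad \text{for } k \in I_K \text{ such that } \mathcal{M}(k) \le M, \]

\[ \Pr\{K \ge k \mid M\} \le \frac{\binom{M}{k} \binom{N - M}{n - k}}{\binom{\mathcal{M}(k)}{k} \binom{N - \mathcal{M}(k)}{n - k}} \quad \text{for } k \in I_K \text{ such that } \mathcal{M}(k) \ge M. \]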

Actually, a specialized version of the inequalities in Corollary [5] had been used in the 15-th 
version of our paper [2] published in arXiv on August 6, 2010 for developing multistage sampling 
schemes for estimating population proportion p. Moreover, the specialized inequalities had been 
used in the 20-th version of our paper [3] published in arXiv on August 7, 2010 for developing 
multistage testing plans for hypotheses regarding p. 



5.5 Hypergeometric Waiting-Time Distribution 

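The hypergeometric waiting-time distribution can be described by the following model. Consider
a finite population of $N$ units, of which there are $M$ units having a certain attribute. Continue
sampling until $r$ units of the attribute are observed or the whole population is checked. Let
$\mathbf{n}$ be the number of units checked when the sampling is stopped. Clearly, in the case of
$r > M$, it must be true that $\Pr\{\mathbf{n} = N\} = 1$, since the whole population is checked. In
the case of $r \le M$, the random variable $\mathbf{n}$ has a hypergeometric waiting-time distribution
such that

\[ \Pr\{\mathbf{n} = n \mid M\} = \frac{\binom{n - 1}{r - 1} \binom{N - n}{M - r}}{\binom{N}{M}} \]

for $r \le M$ and $r \le n \le N$. It can be shown that

\[ \frac{\Pr\{\mathbf{n} = n + 1 \mid M_1\}}{\Pr\{\mathbf{n} = n + 1 \mid M_0\}} \left[ \frac{\Pr\{\mathbf{n} = n \mid M_1\}}{\Pr\{\mathbf{n} = n \mid M_0\}} \right]^{-1} = \frac{N - n - M_1 + r}{N - n - M_0 + r} \le 1 \]

for $M_0 < M_1$, which implies that the hypergeometric waiting-time distribution possesses the
MLRP with respect to $-\mathbf{n}$. Hence, by virtue of Theorem 1, we have the following results.

Corollary 6 Let $\mathcal{M} = \mathcal{M}(n)$ be a function of $n \in I_{\mathbf{n}}$, which takes
values in $\{m \in \mathbb{Z} : r \le m \le N\}$. Then,

\[ \Pr\{\mathbf{n} \le n \mid M\} \le \frac{\binom{N}{\mathcal{M}(n)} \binom{N - n}{M - r}}{\binom{N}{M} \binom{N - n}{\mathcal{M}(n) - r}} = \frac{\binom{M}{r} \binom{N - M}{n - r}}{\binom{\mathcal{M}(n)}{r} \binom{N - \mathcal{M}(n)}{n - r}} \quad \text{for } n \in I_{\mathbf{n}} \text{ such that } \mathcal{M}(n) \ge M, \]

\[ \Pr\{\mathbf{n} \ge n \mid M\} \le \frac{\binom{N}{\mathcal{M}(n)} \binom{N - n}{M - r}}{\binom{N}{M} \binom{N - n}{\mathcal{M}(n) - r}} = \frac{\binom{M}{r} \binom{N - M}{n - r}}{\binom{\mathcal{M}(n)}{r} \binom{N - \mathcal{M}(n)}{n - r}} \quad \text{for } n \in I_{\mathbf{n}} \text{ such that } \mathcal{M}(n) \le M. \]
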
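5.6 Normal Distribution

The probability density function of a Gaussian random variable, $X$, with mean $\mu$ and variance
$\sigma^2$ is given by

\[ f(x; \mu) = \frac{1}{\sqrt{2\pi} \, \sigma} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right) = h(x) \exp\big(\eta(\mu) T(x) - A(\mu)\big), \]

where

\[ T(x) = x, \qquad h(x) = \frac{1}{\sqrt{2\pi} \, \sigma} \exp\left( -\frac{x^2}{2\sigma^2} \right), \qquad \eta(\mu) = \frac{\mu}{\sigma^2}, \qquad A(\mu) = \frac{\mu^2}{2\sigma^2}. \]

Since $\frac{dA(\mu)}{d\mu} = \mu \frac{d\eta(\mu)}{d\mu}$ holds, by Theorem 3, we have the following
results.

Corollary 7 Let $X_1, \dots, X_n$ be i.i.d. samples of Gaussian random variable $X$ of mean $\mu$ and
variance $\sigma^2$. Then,

\[ \Pr\left\{ \frac{\sum_{i=1}^{n} X_i}{n} \le z \right\} \le \frac{1}{2} \exp\left( -\frac{n (z - \mu)^2}{2\sigma^2} \right) \quad \text{for } z \le \mu, \]

\[ \Pr\left\{ \frac{\sum_{i=1}^{n} X_i}{n} \ge z \right\} \le \frac{1}{2} \exp\left( -\frac{n (z - \mu)^2}{2\sigma^2} \right) \quad \text{for } z \ge \mu. \]

It should be noted that the inequalities in Corollary 7 may be shown by using other methods.
However, the factor $\frac{1}{2}$ cannot be obtained by using Chernoff bounds.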
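The Gaussian case is easy to check numerically because the sample mean is itself Gaussian. The Python sketch below compares the exact upper tail of the sample mean with the bound $\frac{1}{2}\exp(-n(z-\mu)^2/(2\sigma^2))$; the function names are hypothetical and the parameter values are arbitrary.

```python
import math


def normal_mean_upper_tail(z: float, mu: float, sigma: float, n: int) -> float:
    """Exact Pr{sample mean of n N(mu, sigma^2) variables >= z}."""
    return 0.5 * math.erfc((z - mu) * math.sqrt(n) / (sigma * math.sqrt(2.0)))


def gaussian_lr_bound(z: float, mu: float, sigma: float, n: int) -> float:
    """Bound (1/2) * exp(-n (z - mu)^2 / (2 sigma^2)), valid here for z >= mu."""
    return 0.5 * math.exp(-n * (z - mu) ** 2 / (2.0 * sigma**2))


if __name__ == "__main__":
    mu, sigma, n = 0.0, 1.0, 25
    for z in (0.0, 0.1, 0.3, 0.5, 1.0):
        exact = normal_mean_upper_tail(z, mu, sigma, n)
        bound = gaussian_lr_bound(z, mu, sigma, n)
        print(f"z={z:4.1f}  exact={exact:.3e}  bound={bound:.3e}")
```

At $z = \mu$ both quantities equal $\frac{1}{2}$, which shows that the factor cannot be improved uniformly in $z$.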

5.7 Gamma Distribution 

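In probability theory and statistics, a random variable $X$ is said to have a gamma distribution if
its density function is of the form

\[ f(x) = \frac{x^{k-1}}{\Gamma(k) \, \theta^k} \exp\left( -\frac{x}{\theta} \right) = h(x) \exp\big(\eta(\theta) T(x) - A(\theta)\big) \quad \text{for } 0 < x < \infty \]

where $\theta > 0$, $k > 0$ are referred to as the scale parameter and shape parameter respectively, and

\[ h(x) = \frac{x^{k-1}}{\Gamma(k)}, \qquad T(x) = \frac{x}{k}, \qquad \eta(\theta) = -\frac{k}{\theta}, \qquad A(\theta) = k \ln \theta. \]

The moment generating function of $X$ is $M(t) = \mathbb{E}[e^{tX}] = (1 - \theta t)^{-k}$ for
$t < \frac{1}{\theta}$. It can be shown by induction that

\[ \frac{d^{\ell+1} M(t)}{dt^{\ell+1}} = \frac{(k + \ell) \, \theta}{1 - \theta t} \, \frac{d^{\ell} M(t)}{dt^{\ell}} \]

for $\ell = 0, 1, 2, \dots$. Therefore,

\[ \mathbb{E}[X^{\ell+1}] = \frac{d^{\ell+1} M(t)}{dt^{\ell+1}} \bigg|_{t=0} = \theta^{\ell+1} \prod_{i=0}^{\ell} (k + i), \]

\[ \mathbb{E}[|X - k\theta|^2] = k\theta^2, \qquad \mathbb{E}[|X - k\theta|^4] = \sum_{i=0}^{4} \binom{4}{i} (-k\theta)^i \, \mathbb{E}[X^{4-i}] = 3k(k + 2) \, \theta^4 \]

and

\[ \frac{\mathbb{E}^{3/4}[|X - k\theta|^4]}{\mathbb{E}^{3/2}[|X - k\theta|^2]} = \left( 3 + \frac{6}{k} \right)^{3/4}. \qquad (20) \]

Since $\frac{dA(\theta)}{d\theta} = \theta \frac{d\eta(\theta)}{d\theta}$ holds, making use of (20)
and Theorem 3, we have

Corollary 8 Let $X_1, \dots, X_n$ be i.i.d. samples of Gamma random variable $X$ of shape parameter
$k$ and scale parameter $\theta$. Then,

\[ \Pr\left\{ \frac{\sum_{i=1}^{n} X_i}{n} \ge \rho k \theta \right\} \le \left( \frac{1}{2} + \Delta \right) [\rho \exp(1 - \rho)]^{kn} \quad \text{for } \rho \ge 1, \]

\[ \Pr\left\{ \frac{\sum_{i=1}^{n} X_i}{n} \le \rho k \theta \right\} \le \left( \frac{1}{2} + \Delta \right) [\rho \exp(1 - \rho)]^{kn} \quad \text{for } 0 < \rho \le 1, \]

where

\[ \Delta = \min\left\{ \frac{1}{2}, \; \left( 3 + \frac{6}{k} \right)^{3/4} \frac{C_{\mathrm{BE}}}{\sqrt{n}} \right\}. \]

It should be noted that the chi-square distribution of $k$ degrees of freedom is a special case
of the Gamma distribution with shape parameter $\frac{k}{2}$ and scale parameter 2. The exponential
distribution of mean $\theta$ is also a special case of the Gamma distribution with shape parameter
1 and scale parameter $\theta$. If the shape parameter $k$ is an integer, then the Gamma distribution
represents an Erlang distribution. Therefore, the bounds in Corollary 8 can be used for those
distributions.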



Let $\widehat{\theta} = \frac{\sum_{i=1}^{n} X_i}{kn}$. In order to find the sample size $n$ such that
$\Pr\{|\widehat{\theta} - \theta| < \varepsilon\theta\} > 1 - \delta$, we have established the
following result.



Theorem 6 Let e > and < 8 < 1. 77ien, Pr { - (9 < e<9 j > 1 - 5 if n > ^ 



lni±i 



fc[e+ln(l+e)] 



£ , m , where 



fc/ \ In 



i 



5.8 Student's t-Distribution 



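If the random variable $X$ has a density function of the form

\[ f(x) = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi} \, \Gamma\left(\frac{n}{2}\right) \left(1 + \frac{x^2}{n}\right)^{(n+1)/2}}, \qquad \text{for } -\infty < x < \infty, \]

then the variable $X$ is said to possess a Student's t-distribution with $n$ degrees of freedom.

Now, we want to bound the tail probabilities of the distribution of $X$. Define $Y = \theta |X|$,
where $\theta$ is a positive number. Then, $Y$ is a random variable parameterized by $\theta$. For
any real number $t > 0$,

\[ \Pr\{Y \le t\} = \Pr\left\{ |X| \le \frac{t}{\theta} \right\} = 2 \int_{-\infty}^{t/\theta} f(x) \, dx - 1. \]

By differentiation, we obtain the probability density function of $Y$ as
$f_Y(t, \theta) = \frac{2}{\theta} f\left(\frac{t}{\theta}\right)$. Note that, for
$\theta_0 < \theta_1$,

\[ \frac{f_Y(t, \theta_1)}{f_Y(t, \theta_0)} = \frac{\theta_0 \, f\left(\frac{t}{\theta_1}\right)}{\theta_1 \, f\left(\frac{t}{\theta_0}\right)} = \frac{\theta_0}{\theta_1} \left( 1 + \frac{\frac{1}{\theta_0^2} - \frac{1}{\theta_1^2}}{\frac{n}{t^2} + \frac{1}{\theta_1^2}} \right)^{(n+1)/2}, \]

which is monotonically increasing with respect to $t \in I_Y$. This implies that the likelihood
ratio $\frac{f_Y(Y, \theta_1)}{f_Y(Y, \theta_0)}$ is monotonically increasing with respect to $Y$.
Therefore, by Theorem 1,

\[ \Pr\{|X| \ge x\} = \Pr\{Y \ge x\theta\} \le \frac{f_Y(x\theta, \theta)}{f_Y(x\theta, x\theta)} = \frac{x f(x)}{f(1)} = x \left( \frac{n + 1}{n + x^2} \right)^{(n+1)/2} \]

for $x \ge 1$. Similarly,

\[ \Pr\{|X| \le x\} = \Pr\{Y \le x\theta\} \le \frac{f_Y(x\theta, \theta)}{f_Y(x\theta, x\theta)} = \frac{x f(x)}{f(1)} = x \left( \frac{n + 1}{n + x^2} \right)^{(n+1)/2} \]

for $0 < x \le 1$. By differentiation, we can show that the upper bound of the tail probabilities is
unimodal with respect to $x$. In summary, we have the following results.

Corollary 9 Suppose $X$ possesses a Student's t-distribution with $n$ degrees of freedom. Then,

\[ \Pr\{|X| \ge x\} \le x \left( \frac{n + 1}{n + x^2} \right)^{(n+1)/2} \quad \text{for } x \ge 1, \]

\[ \Pr\{|X| \le x\} \le x \left( \frac{n + 1}{n + x^2} \right)^{(n+1)/2} \quad \text{for } 0 < x \le 1, \]

where the upper bound of the tail probabilities is monotonically increasing with respect to
$x \in (0, 1)$ and monotonically decreasing with respect to $x \in (1, \infty)$.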
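The bound for Student's t-distribution can be compared with probabilities obtained by numerical integration of the density. The Python sketch below is only an illustration: it integrates the density with the composite Simpson rule (an assumption made for self-containedness, not a method used in the derivation), and the function names are hypothetical.

```python
import math


def t_density(x: float, n: int) -> float:
    """Density of Student's t-distribution with n degrees of freedom."""
    log_c = math.lgamma((n + 1) / 2.0) - math.lgamma(n / 2.0) - 0.5 * math.log(n * math.pi)
    return math.exp(log_c - (n + 1) / 2.0 * math.log1p(x * x / n))


def prob_abs_le(x: float, n: int, steps: int = 2000) -> float:
    """Pr{|X| <= x} = 2 * integral of the density over [0, x] (composite Simpson, even steps)."""
    h = x / steps
    total = t_density(0.0, n) + t_density(x, n)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * t_density(i * h, n)
    return 2.0 * total * h / 3.0


def t_lr_bound(x: float, n: int) -> float:
    """x * ((n + 1) / (n + x^2))^((n + 1) / 2)."""
    return x * ((n + 1.0) / (n + x * x)) ** ((n + 1) / 2.0)


if __name__ == "__main__":
    for n in (1, 5, 30):
        for x in (1.0, 2.0, 5.0):    # upper tail, x >= 1
            print(n, x, 1.0 - prob_abs_le(x, n), t_lr_bound(x, n))
        for x in (0.1, 0.5):         # central probability, 0 < x <= 1
            print(n, x, prob_abs_le(x, n), t_lr_bound(x, n))
```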

5.9 Snedecor's F-Distribution 

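If the random variable $X$ has a density function of the form

\[ f(x) = \frac{\Gamma\left(\frac{n+m}{2}\right) \left(\frac{m}{n}\right)^{m/2} x^{\frac{m}{2} - 1}}{\Gamma\left(\frac{m}{2}\right) \Gamma\left(\frac{n}{2}\right) \left(1 + \frac{m}{n} x\right)^{(n+m)/2}}, \qquad \text{for } 0 < x < \infty, \]

then the variable $X$ is said to possess an F-distribution with $m$ and $n$ degrees of freedom.

Now, we want to bound the tail probabilities of the distribution of $X$. Define $Y = \theta X$,
where $\theta$ is a positive number. Then, $Y$ is a random variable parameterized by $\theta$. For
any real number $t > 0$,

\[ \Pr\{Y \le t\} = \Pr\left\{ X \le \frac{t}{\theta} \right\} = \int_0^{t/\theta} f(x) \, dx. \]

By differentiation, we obtain the probability density function of $Y$ as
$f_Y(t, \theta) = \frac{1}{\theta} f\left(\frac{t}{\theta}\right)$. Note that, for
$\theta_0 < \theta_1$,

\[ \frac{f_Y(t, \theta_1)}{f_Y(t, \theta_0)} = \frac{\theta_0 \, f\left(\frac{t}{\theta_1}\right)}{\theta_1 \, f\left(\frac{t}{\theta_0}\right)} = \left( \frac{\theta_0}{\theta_1} \right)^{m/2} \left( 1 + \frac{\frac{1}{\theta_0} - \frac{1}{\theta_1}}{\frac{n}{mt} + \frac{1}{\theta_1}} \right)^{(n+m)/2}, \]

which is monotonically increasing with respect to $t \in I_Y$. This implies that the likelihood
ratio $\frac{f_Y(Y, \theta_1)}{f_Y(Y, \theta_0)}$ is monotonically increasing with respect to $Y$.
Therefore, by Theorem 1,

\[ \Pr\{X \ge x\} = \Pr\{Y \ge x\theta\} \le \frac{f_Y(x\theta, \theta)}{f_Y(x\theta, x\theta)} = \frac{x f(x)}{f(1)} = x^{m/2} \left( \frac{n + m}{n + mx} \right)^{(m+n)/2} \]

for $x \ge 1$. Similarly,

\[ \Pr\{X \le x\} = \Pr\{Y \le x\theta\} \le \frac{f_Y(x\theta, \theta)}{f_Y(x\theta, x\theta)} = \frac{x f(x)}{f(1)} = x^{m/2} \left( \frac{n + m}{n + mx} \right)^{(m+n)/2} \]

for $0 < x \le 1$. By differentiation, we can show that the upper bound of the tail probabilities is
unimodal with respect to $x$. Formally, we state the results as follows.

Corollary 10 Suppose $X$ possesses an F-distribution with $m$ and $n$ degrees of freedom. Then,

\[ \Pr\{X \ge x\} \le x^{m/2} \left( \frac{n + m}{n + mx} \right)^{(m+n)/2} \quad \text{for } x \ge 1, \]

\[ \Pr\{X \le x\} \le x^{m/2} \left( \frac{n + m}{n + mx} \right)^{(m+n)/2} \quad \text{for } 0 < x \le 1, \]

where the upper bound of the tail probabilities is monotonically increasing with respect to
$x \in (0, 1)$ and monotonically decreasing with respect to $x \in (1, \infty)$.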

6 Using Probabilistic Inequalities for Parameter Estimation 

In this section, we shall explore the general applications of the probabilistic inequalities for pa- 
rameter estimation. 

6.1 Interval Estimation 

From Theorem 1, it can be seen that, for a large class of distributions, the likelihood ratio bounds
of the cumulative distribution function and complementary cumulative distribution function of random
variable $\varphi$ are partially monotone. Such monotonicity can be exploited for the interval
estimation of the underlying parameter $\theta$. In this direction, we have developed a method for
constructing a confidence interval for $\theta$ as follows.

Theorem 7 Let $\varphi$ be a random variable possessing a distribution determined by parameter
$\theta \in \Theta$. Let $I_\varphi$ denote the support of $\varphi$. Let $F(\cdot, \cdot)$ and
$G(\cdot, \cdot)$ be bivariate functions possessing the following properties:

(i) $F(z, \vartheta)$ is non-increasing with respect to $\vartheta$ no less than $z \in I_\varphi$;

(ii) $G(z, \vartheta)$ is non-decreasing with respect to $\vartheta$ no greater than $z \in I_\varphi$;

(iii)

\[ \Pr\{\varphi \le z \mid \theta\} \le F(z, \theta) \quad \text{for } z \text{ no greater than } \theta \in \Theta, \]
\[ \Pr\{\varphi \ge z \mid \theta\} \le G(z, \theta) \quad \text{for } z \text{ no less than } \theta \in \Theta. \]

Let $\delta \in (0, 1)$. Define confidence limits $L(\varphi, \delta)$ and $U(\varphi, \delta)$ as
functions of $\varphi$ and $\delta$ such that
$\{F(\varphi, U(\varphi, \delta)) \le \frac{\delta}{2}, \; G(\varphi, L(\varphi, \delta)) \le \frac{\delta}{2}, \; L(\varphi, \delta) \le \varphi \le U(\varphi, \delta)\}$
is a sure event. Then,
$\Pr\{L(\varphi, \delta) < \theta < U(\varphi, \delta) \mid \theta\} \ge 1 - \delta$ for any
$\theta \in \Theta$.

See Appendix D for a proof. By the monotonicity of $F(z, \theta)$ and $G(z, \theta)$ with respect
to $\theta$, we can obtain the lower and upper confidence limits $L(\varphi, \delta)$ and
$U(\varphi, \delta)$ by a bisection approach. In the context of Theorem 1, $F(z, \theta)$ and
$G(z, \theta)$ have the same expression $\mathscr{M}(z, \theta)$.
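The bisection approach can be sketched for the binomial case. The Python code below is an illustrative sketch, not the construction used in the paper: it assumes $n$ Bernoulli samples with $k$ successes, takes both bounding functions equal to $\exp\big(n[z\ln\frac{p}{z} + (1-z)\ln\frac{1-p}{1-z}]\big)$ with $z = k/n$, sets the limits to 0 or 1 at the boundary cases $k = 0$ and $k = n$, and uses hypothetical function names and tolerances.

```python
import math


def log_lr_bound(z: float, p: float, n: int) -> float:
    """ln of the likelihood ratio bound for Bernoulli samples, with 0 * ln(0) = 0."""
    total = 0.0
    if z > 0.0:
        total += z * math.log(p / z)
    if z < 1.0:
        total += (1.0 - z) * math.log((1.0 - p) / (1.0 - z))
    return n * total


def confidence_limits(k: int, n: int, delta: float, tol: float = 1e-10):
    """Return (L, U) with bound(z, L) <= delta/2 and bound(z, U) <= delta/2, L <= z <= U."""
    z = k / n
    target = math.log(delta / 2.0)
    if k == n:
        upper = 1.0
    else:
        lo, hi = z, 1.0 - 1e-15          # bound equals 1 at lo, is below delta/2 at hi
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if log_lr_bound(z, mid, n) <= target:
                hi = mid
            else:
                lo = mid
        upper = hi                        # side on which the bound is <= delta/2
    if k == 0:
        lower = 0.0
    else:
        lo, hi = 1e-15, z                 # bound is below delta/2 at lo, equals 1 at hi
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if log_lr_bound(z, mid, n) <= target:
                lo = mid
            else:
                hi = mid
        lower = lo
    return lower, upper


def coverage(p: float, n: int, delta: float) -> float:
    """Exact probability that the interval (L, U) contains p."""
    total = 0.0
    for k in range(n + 1):
        lower, upper = confidence_limits(k, n, delta)
        if lower < p < upper:
            total += math.comb(n, k) * p**k * (1.0 - p) ** (n - k)
    return total


if __name__ == "__main__":
    print(confidence_limits(12, 40, 0.05))
    print(coverage(0.3, 40, 0.05))
```

The exact coverage computed by `coverage` can be compared against $1 - \delta$ to see how conservative the resulting interval is.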

6.2 Asymptotically Tight Bound of Sample Size 

Clearly, the likelihood ratio bound may be applied to the determination of sample size for parame- 
ter estimation. Since the likelihood ratio bound coincides with Chernoff bound for the exponential 
family, it is interesting to investigate the sample size issue in connection with Chernoff bound. 

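Let a population be denoted by a random variable $X$. Let $\mu$ be the mean of $X$. Suppose that
the distribution of $X$ is parameterized by $\mu$. Suppose that the moment generating function $E[e^{tX}]$
exists for any real number $t$. Let $\overline{X}_n = \frac{\sum_{i=1}^n X_i}{n}$, where $X_1, \cdots, X_n$ are i.i.d. samples of random
variable $X$. Chernoff bound asserts that

$$\Pr\{\overline{X}_n \le \mu - \varepsilon\} \le [\mathcal{F}(\mu - \varepsilon, \mu)]^n,$$

$$\Pr\{\overline{X}_n \ge \mu + \varepsilon\} \le [\mathcal{G}(\mu + \varepsilon, \mu)]^n,$$

where

$$\mathcal{F}(z, \mu) = \inf_{t \le 0} E[e^{t(X - z)}], \qquad \mathcal{G}(z, \mu) = \inf_{t \ge 0} E[e^{t(X - z)}].$$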



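Let $\varepsilon > 0$ be a pre-specified margin of absolute error. Let $\delta > 0$ be a pre-specified confidence
parameter. It is a ubiquitous problem to estimate $\mu$ by its empirical mean $\overline{X}_n$ such that

$$\Pr\{|\overline{X}_n - \mu| < \varepsilon\} > 1 - \delta.$$

To guarantee the above requirement, it suffices to choose the sample size $n$ greater than

$$N_c(\delta) = \max\left\{ \frac{\ln\frac{\delta}{2}}{\ln \mathcal{F}(\mu - \varepsilon, \mu)},\ \frac{\ln\frac{\delta}{2}}{\ln \mathcal{G}(\mu + \varepsilon, \mu)} \right\}.$$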
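
As a hedged illustration of the sample size bound $N_c(\delta)$, consider the special case where $X$ is a Bernoulli variable with mean $\mu$. The sketch assumes that the two bounding functions are the usual Chernoff infima of $E[e^{t(X-z)}]$, whose logarithm for a Bernoulli variable is minus the Kullback-Leibler divergence between Bernoulli($z$) and Bernoulli($\mu$). The function names below are hypothetical and the restriction $0 < \mu - \varepsilon$, $\mu + \varepsilon < 1$ is a simplification of this sketch.

```python
import math


def kl_bernoulli(a, b):
    """Kullback-Leibler divergence KL(Bernoulli(a) || Bernoulli(b)), 0 < b < 1."""
    def term(p, q):
        # p * log(p / q) with the convention 0 * log(0) = 0
        return 0.0 if p == 0.0 else p * math.log(p / q)
    return term(a, b) + term(1.0 - a, 1.0 - b)


def chernoff_sample_size(mu, eps, delta):
    """N_c(delta) for a Bernoulli population (assumed special case)."""
    if not (0.0 < mu - eps and mu + eps < 1.0):
        raise ValueError("this sketch requires 0 < mu - eps and mu + eps < 1")
    ln_f = -kl_bernoulli(mu - eps, mu)  # log of the lower-tail Chernoff bound
    ln_g = -kl_bernoulli(mu + eps, mu)  # log of the upper-tail Chernoff bound
    ln_half_delta = math.log(delta / 2.0)
    return max(ln_half_delta / ln_f, ln_half_delta / ln_g)


if __name__ == "__main__":
    print(chernoff_sample_size(0.5, 0.1, 0.05))
```

For comparison, the Hoeffding-type bound $\ln(2/\delta)/(2\varepsilon^2)$ is never smaller than this value in the Bernoulli case, because the divergence is at least $2\varepsilon^2$.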



It is of theoretical and practical importance to know how tight this sample size bound is. Let
$N_a(\delta)$ be the minimum sample size $n$ to guarantee $\Pr\{|\overline{X}_n - \mu| < \varepsilon\} > 1 - \delta$. We have the
following result.

Theorem 8

$$\lim_{\delta \to 0} \frac{N_a(\delta)}{N_c(\delta)} = 1.$$

See Appendix E for a proof. This theorem implies that, for high confidence estimation (i.e.,
small $\delta$), the sample size bound $N_c(\delta)$ can be quite tight.

7 Conclusion 

In this paper, we have opened a new avenue for deriving probabilistic inequalities. In particular, we
have established a fundamental connection between the monotone likelihood ratio property and tail
probabilities. A unified theory has been developed for bounding the tail probabilities of the exponential
family of distributions. Simple and sharp bounds are obtained for some other important distributions.



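A Proof of Theorem 1

To prove inequalities (1) and (2), we shall focus on the case that $X_1, \cdots, X_n$ are discrete random
variables. First, we need to establish (1). For $z \in \mathcal{Z}$ such that $\vartheta(z)$ is no less than $\theta$, the inequality
(1) is trivially true if $\mathcal{M}(z, \theta)$ is not bounded. It remains to consider the case that $\mathcal{M}(z, \theta)$ is
bounded. By the MLRP assumption, for $z \in \mathcal{Z}$ such that $\vartheta(z)$ is no less than $\theta$, the likelihood
ratio $\Lambda(y, \theta, \vartheta(z))$ is non-decreasing with respect to $y \in \mathcal{Z}$. In other words, the likelihood ratio
$\Lambda(y, \vartheta(z), \theta)$ is non-increasing with respect to $y \in \mathcal{Z}$ provided that $\vartheta(z) \ge \theta$. Hence, for $z \in \mathcal{Z}$
such that $\vartheta(z) \ge \theta$, it must be true that $\Lambda(\varphi(x_1, \cdots, x_n), \vartheta(z), \theta) \le \Lambda(z, \vartheta(z), \theta)$ for all observation
$(x_1, \cdots, x_n)$ of random tuple $(X_1, \cdots, X_n)$ such that $\varphi(x_1, \cdots, x_n) \ge z$. Moreover, since $\mathcal{M}(z, \theta)$
is bounded, it must be true that $f_n(x_1, \cdots, x_n; \vartheta(z)) > 0$ for all observation $(x_1, \cdots, x_n)$ of random
tuple $(X_1, \cdots, X_n)$ such that $\varphi(x_1, \cdots, x_n) \ge z$ and $f_n(x_1, \cdots, x_n; \theta) > 0$. It follows that

$$\begin{aligned}
\Pr\{\varphi \ge z \mid \theta\}
&= \sum_{\substack{\varphi(x_1, \cdots, x_n) \ge z \\ f_n(x_1, \cdots, x_n; \theta) > 0}} f_n(x_1, \cdots, x_n; \theta) \\
&= \sum_{\substack{\varphi(x_1, \cdots, x_n) \ge z \\ f_n(x_1, \cdots, x_n; \theta) > 0}} \frac{f_n(x_1, \cdots, x_n; \theta)}{f_n(x_1, \cdots, x_n; \vartheta(z))} \, f_n(x_1, \cdots, x_n; \vartheta(z)) \\
&= \sum_{\substack{\varphi(x_1, \cdots, x_n) \ge z \\ f_n(x_1, \cdots, x_n; \theta) > 0}} \Lambda(\varphi(x_1, \cdots, x_n), \vartheta(z), \theta) \times f_n(x_1, \cdots, x_n; \vartheta(z)) \\
&\le \sum_{\varphi(x_1, \cdots, x_n) \ge z} \Lambda(\varphi(x_1, \cdots, x_n), \vartheta(z), \theta) \times f_n(x_1, \cdots, x_n; \vartheta(z)) \\
&\le \sum_{\varphi(x_1, \cdots, x_n) \ge z} \Lambda(z, \vartheta(z), \theta) \times f_n(x_1, \cdots, x_n; \vartheta(z)) \\
&= \Lambda(z, \vartheta(z), \theta) \sum_{\varphi(x_1, \cdots, x_n) \ge z} f_n(x_1, \cdots, x_n; \vartheta(z)) \\
&= \mathcal{M}(z, \theta) \times \Pr\{\varphi \ge z \mid \vartheta(z)\} \le \mathcal{M}(z, \theta)
\end{aligned}$$

for $z \in \mathcal{Z}$ such that $\vartheta(z)$ is no less than $\theta$. This establishes (1).

In order to show (2), it suffices to consider the case that $\mathcal{M}(z, \theta)$ is bounded, since the inequality
(2) is trivially true if $\mathcal{M}(z, \theta)$ is not bounded. By the MLRP assumption, for $z \in \mathcal{Z}$ such that
$\vartheta(z)$ is no greater than $\theta$, the likelihood ratio $\Lambda(y, \vartheta(z), \theta)$ is non-decreasing with respect to $y \in \mathcal{Z}$.
Hence, for $z \in \mathcal{Z}$ such that $\vartheta(z) \le \theta$, it must be true that $\Lambda(\varphi(x_1, \cdots, x_n), \vartheta(z), \theta) \le \Lambda(z, \vartheta(z), \theta)$
for all observation $(x_1, \cdots, x_n)$ of random tuple $(X_1, \cdots, X_n)$ such that $\varphi(x_1, \cdots, x_n) \le z$. Moreover,
since $\mathcal{M}(z, \theta)$ is bounded, it must be true that $f_n(x_1, \cdots, x_n; \vartheta(z)) > 0$ for all observation
$(x_1, \cdots, x_n)$ of random tuple $(X_1, \cdots, X_n)$ such that $\varphi(x_1, \cdots, x_n) \le z$ and $f_n(x_1, \cdots, x_n; \theta) > 0$.
It follows that

$$\begin{aligned}
\Pr\{\varphi \le z \mid \theta\}
&= \sum_{\substack{\varphi(x_1, \cdots, x_n) \le z \\ f_n(x_1, \cdots, x_n; \theta) > 0}} f_n(x_1, \cdots, x_n; \theta) \\
&= \sum_{\substack{\varphi(x_1, \cdots, x_n) \le z \\ f_n(x_1, \cdots, x_n; \theta) > 0}} \Lambda(\varphi(x_1, \cdots, x_n), \vartheta(z), \theta) \times f_n(x_1, \cdots, x_n; \vartheta(z)) \\
&\le \sum_{\varphi(x_1, \cdots, x_n) \le z} \Lambda(\varphi(x_1, \cdots, x_n), \vartheta(z), \theta) \times f_n(x_1, \cdots, x_n; \vartheta(z)) \\
&\le \sum_{\varphi(x_1, \cdots, x_n) \le z} \Lambda(z, \vartheta(z), \theta) \times f_n(x_1, \cdots, x_n; \vartheta(z)) \\
&= \Lambda(z, \vartheta(z), \theta) \sum_{\varphi(x_1, \cdots, x_n) \le z} f_n(x_1, \cdots, x_n; \vartheta(z)) \\
&= \mathcal{M}(z, \theta) \times \Pr\{\varphi \le z \mid \vartheta(z)\} \le \mathcal{M}(z, \theta)
\end{aligned}$$

for $z \in \mathcal{Z}$ such that $\vartheta(z)$ is no greater than $\theta$. This proves (2).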

The proof of inequalities (1) and (2) for the case that $X_1, \cdots, X_n$ are continuous variables
can be completed by replacing the summation of probability mass functions with integration of
probability density functions. It remains to show statements (i), (ii) and (iii).

Clearly, statement (i) is a direct consequence of assumptions (a), (b) and the definition of $\varphi(.)$.
The monotonicity of $\mathcal{M}(z, \theta)$ with respect to $\theta$ as described by statement (ii) of the theorem can
be established as follows. To show $\mathcal{M}(z, \theta_2) \le \mathcal{M}(z, \theta_1)$ for $\theta_2 > \theta_1 \ge z$, note that

$$\mathcal{M}(z, \theta_2) = \frac{g(z, \theta_2)}{g(z, z)} \le \frac{g(z, \theta_1)}{g(z, z)} = \mathcal{M}(z, \theta_1),$$

where the inequality is due to the assumption that $g(z, \theta)$ is non-increasing with respect to $\theta$ no
less than $z$. On the other hand, to show $\mathcal{M}(z, \theta_1) \le \mathcal{M}(z, \theta_2)$ for $\theta_1 < \theta_2 \le z$, note that

$$\mathcal{M}(z, \theta_1) = \frac{g(z, \theta_1)}{g(z, z)} \le \frac{g(z, \theta_2)}{g(z, z)} = \mathcal{M}(z, \theta_2),$$

where the inequality is due to the assumption that $g(z, \theta)$ is non-decreasing with respect to $\theta$ no
greater than $z$. This justifies statement (ii) of the theorem.

Finally, consider the monotonicity of $\mathcal{M}(z, \theta)$ with respect to $z$ as described by statement (iii)
of the theorem. To show $\mathcal{M}(z_2, \theta) \le \mathcal{M}(z_1, \theta)$ for $z_2 > z_1 \ge \theta$, notice that

$$\mathcal{M}(z_2, \theta) = \frac{g(z_2, \theta)}{g(z_2, z_2)} \le \frac{g(z_2, \theta)}{g(z_2, z_1)} \le \frac{g(z_1, \theta)}{g(z_1, z_1)} = \mathcal{M}(z_1, \theta),$$

where the first inequality is due to the assumption that $g(z, \theta)$ is non-decreasing with respect to
$\theta$ no greater than $z$ and the second one is due to the assumption that $\Lambda(z, \theta_0, \theta_1) = \frac{g(z, \theta_1)}{g(z, \theta_0)}$ is
non-decreasing with respect to $z$ provided that $\theta_0 < \theta_1$. On the other side, to show $\mathcal{M}(z_2, \theta) \ge \mathcal{M}(z_1, \theta)$
for $z_1 < z_2 \le \theta$, it suffices to observe that

$$\mathcal{M}(z_1, \theta) = \frac{g(z_1, \theta)}{g(z_1, z_1)} \le \frac{g(z_1, \theta)}{g(z_1, z_2)} \le \frac{g(z_2, \theta)}{g(z_2, z_2)} = \mathcal{M}(z_2, \theta),$$

where the first inequality is due to the assumption that $g(z, \theta)$ is non-increasing with respect
to $\theta$ no less than $z$ and the second one is due to the assumption that $\Lambda(z, \theta_0, \theta_1) = \frac{g(z, \theta_1)}{g(z, \theta_0)}$ is
non-decreasing with respect to $z$ provided that $\theta_0 < \theta_1$. Statement (iii) of the theorem is thus
proved.

B Proof of Theorem 2

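For simplicity of notations, define $F(z, \theta) = \Pr\{\varphi_n \le z \mid \theta\}$ and $G(z, \theta) = \Pr\{\varphi_n \ge z \mid \theta\}$. By
the assumption of the theorem, $\frac{f_n(X_1, \cdots, X_n; \theta)}{f_n(X_1, \cdots, X_n; \widehat{\theta}_n)} = \Lambda(\varphi_n, \widehat{\theta}_n, \theta)$. By virtue of Theorem 1, we have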

Pr{ UXu--- ,X n ;9) a ~ A = r ^ ■ a ~ ■ Q , 

\f n (X u ... ,X n] 9 n ) 2 J 1 2 J 

= Pr {a(^„, 9 n , 9) < |, n < | #} < Pr {i%„, 0)<^O n <0\o} 
<Pr{F(^ n ,0)<f |#}<f 
for any 9 € Q. This proves Similarly, for any 9 G O, 



Pr fn(Xir-- ,X n ;9) a ~ Q A = ^ U ( g e) < « £ > 
l/„(X l5 ... ,X B ;6> n ) 2 J I 2 



A(^n, 9 n , 9) < -, 9 n > 9 I 0} < Pr [c% n , 0) < - , 9 n > 
<Pi{G(<p n ,6)<^\o}<^, 
which establishes ©. To show (JOJ) , making use of (|2|) and ([5]), we have 



pj fn(X u ... ,X n ;9) 

= Pr ( ^ ' < 2, g B < g I 4 + Pr ( ^ ' ^ < 2, fl w > 

a a 
< - + - = a 
~ 2 2 
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for any 9 £ 0. To show (J7J), making use of (j3]), we have that 

{ sup^ ee f n {Xi, ■ ■ ■ , X n ; §) 2 

< Pr ( su Pi?e.y /npfi, • • • Aj) < a < 
I su P^ee fn(X 1 , • • • , X n ;$) ~ 2 ' 

< Pr ( /n(*l,"-,*n;6>) <^<, 



[MXi,--- ,x n] e n ) 2 - j 2 

for any 9 £ 5? . To show ([8]), making use of ([5]), we have that 

p r ( "^/.ffi"-^*) <°ii,> mpJ >i, 

I sup^ 6e /„(Ai, • • • , A n ; tf) 2 
< Pr | SI " , '^ / " (A ' : ------ V " : " i < ? , n > 61 | 



\swptf ee f n (X 1 ,---,X n ;'&) 2 



\ sup^ 6 e / n (^i, ■ • ■ , X n ; i?) 2 



= p , m^w < £ j .>„!„ <£ 

\fn(X u --- ,X n ;l n ) 2 - J 2 
for any 6* G =y. To show ([9|), we use ([6]) to conclude that 

\sup tfg0 / n (Xi,--- ~ 2 J \sup tfge / n (Xi,--- ,X n ;i?) ~ 2 

fn(Xi, ■ ■ ■ , X n ; n ) 2 



for any 9 E S^. This completes the proof of the theorem. 



C Proof of Theorem 3



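Note that $\prod_{i=1}^n f_X(x_i; \theta) = \left[ \prod_{i=1}^n h(x_i) \right] \times \exp\left( \eta(\theta) \sum_{i=1}^n T(x_i) - n A(\theta) \right)$. By the assumption that
$\frac{d\eta(\theta)}{d\theta}$ is positive for $\theta \in \Theta$, we have that the likelihood ratio

$$\Lambda(z, \theta_0, \theta_1) = \left[ \frac{\exp(\eta(\theta_1) z - A(\theta_1))}{\exp(\eta(\theta_0) z - A(\theta_0))} \right]^n$$

is an increasing function of $z \in \Theta$ provided that $\theta_0 < \theta_1$. Applying Theorem 1 with $\vartheta(z) = z$, we
have

$$\Pr\{\widehat{\theta} \ge z \mid \theta\} \le \left[ \frac{\exp(\eta(\theta) z - A(\theta))}{\exp(\eta(z) z - A(z))} \right]^n \times \Pr\{\widehat{\theta} \ge z \mid z\} = \mathcal{M}(z, \theta) \times \Pr\{\widehat{\theta} \ge z \mid z\}$$

for $z \in \Theta$ no less than $\theta \in \Theta$. Similarly, $\Pr\{\widehat{\theta} \le z \mid \theta\} \le \mathcal{M}(z, \theta) \times \Pr\{\widehat{\theta} \le z \mid z\}$ for $z \in \Theta$ no
greater than $\theta \in \Theta$. It remains to show statements (i)-(v) under the additional assumption that
$\frac{A'(\theta)}{\eta'(\theta)} = \theta$. For simplicity of notations, define $w(z, \theta) = \exp(\eta(\theta) z - A(\theta))$. Since
$\frac{d\eta(\theta)}{d\theta} > 0$ and $\frac{A'(\theta)}{\eta'(\theta)} = \theta$ for $\theta \in \Theta$, we have that

$$\frac{\partial w(z, \theta)}{\partial \theta} = (z - \theta) \, w(z, \theta) \, \frac{d\eta(\theta)}{d\theta},$$

which is positive for $\theta < z$ and negative for $\theta > z$. This implies that $w(z, \theta)$ is monotonically
increasing with respect to $\theta$ less than $z$ and monotonically decreasing with respect to $\theta$ greater
than $z$. Therefore, $\widehat{\theta}$ must be a maximum-likelihood estimator of $\theta$.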
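Let $\psi(.)$ be the inverse function of $\eta(.)$ such that

$$\eta(\psi(\zeta)) = \zeta \qquad (21)$$

for $\zeta \in \{\eta(\theta) : \theta \in \Theta\}$. Define compound function $B(.)$ such that $B(\zeta) = A(\psi(\zeta))$ for $\zeta \in \{\eta(\theta) :
\theta \in \Theta\}$. For simplicity of notations, we abbreviate $\psi(\zeta)$ as $\psi$ when this can be done without
causing confusion. By the assumption that $\frac{dA(\theta)}{d\theta} = \theta \frac{d\eta(\theta)}{d\theta}$, we have

$$\frac{dA(\psi)}{d\psi} = \psi \, \frac{d\eta(\psi)}{d\psi}. \qquad (22)$$

Using (21), (22) and the chain rule of differentiation, we have

$$\frac{dB(\zeta)}{d\zeta} = \frac{dA(\psi)}{d\psi} \frac{d\psi}{d\zeta} = \frac{\frac{dA(\psi)}{d\psi}}{\frac{d\eta(\psi)}{d\psi}} \, \frac{d\eta(\psi)}{d\psi} \frac{d\psi}{d\zeta} = \frac{\frac{dA(\psi)}{d\psi}}{\frac{d\eta(\psi)}{d\psi}} \, \frac{d\eta(\psi)}{d\zeta} = \psi(\zeta). \qquad (23)$$

Putting $\zeta = \eta(\theta)$, we have

$$\begin{aligned}
E\left[ \exp\left( n t \widehat{\theta} \right) \right]
&= E\left[ \exp\left( t \sum_{i=1}^n T(X_i) \right) \right] \\
&= \int \cdots \int \prod_{i=1}^n \left[ h(x_i) \exp\left( (\zeta + t) T(x_i) - B(\zeta) \right) \right] dx_1 \cdots dx_n \\
&= \exp\left( n B(\zeta + t) - n B(\zeta) \right) \int \cdots \int \prod_{i=1}^n \left[ h(x_i) \exp\left( (\zeta + t) T(x_i) - B(\zeta + t) \right) \right] dx_1 \cdots dx_n \\
&= \exp\left( n B(\zeta + t) - n B(\zeta) \right).
\end{aligned}$$

By virtue of (23), the derivative of $n B(\zeta + t) - n B(\zeta)$ with respect to $t$ is

$$n \, \frac{dB(\zeta + t)}{dt} = n \psi(\zeta + t),$$

which is equal to $n \psi(\zeta) = n\theta$ for $t = 0$. Thus, $E[\widehat{\theta}] = \theta$, which implies that $\widehat{\theta}$ is also an unbiased
estimator of $\theta$. This proves statement (i).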

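Again by virtue of (23), the derivative of $-tnz + n B(\zeta + t) - n B(\zeta)$ with respect to $t$ is

$$-nz + n \, \frac{dB(\zeta + t)}{dt} = -nz + n \psi(\zeta + t),$$

which is equal to $0$ for $t$ such that $\psi(\zeta + t) = z$ or equivalently, $\zeta + t = \eta(z)$, which implies
$t = \eta(z) - \eta(\theta)$. Since $E\left[ \exp\left( n t (\widehat{\theta} - z) \right) \right]$ is a convex function of $t$, its infimum with respect to
$t \in \mathbb{R}$ is attained at $t = \eta(z) - \eta(\theta)$. It follows that

$$\begin{aligned}
\inf_{t \in \mathbb{R}} E\left[ \exp\left( n t (\widehat{\theta} - z) \right) \right]
&= \inf_{t \in \mathbb{R}} \exp\left( -tnz + n B(\zeta + t) - n B(\zeta) \right) \\
&= \exp\left( -[\eta(z) - \eta(\theta)] n z + n B(\eta(z)) - n B(\zeta) \right) \\
&= \exp\left( -[\eta(z) - \eta(\theta)] n z + n A(z) - n A(\theta) \right) \\
&= \left[ \frac{\exp(\eta(\theta) z - A(\theta))}{\exp(\eta(z) z - A(z))} \right]^n = \mathcal{M}(z, \theta).
\end{aligned}$$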
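Now, consider the monotonicity of $\mathcal{M}(z, \theta)$ with respect to $\theta$ as described by statement (iii)
of the theorem. To show $\mathcal{M}(z, \theta_2) \le \mathcal{M}(z, \theta_1)$ for $\theta_2 > \theta_1 \ge z$, note that

$$\mathcal{M}(z, \theta_2) = \left[ \frac{w(z, \theta_2)}{w(z, z)} \right]^n \le \left[ \frac{w(z, \theta_1)}{w(z, z)} \right]^n = \mathcal{M}(z, \theta_1),$$

where the inequality is due to the fact that $w(z, \theta)$ is non-increasing with respect to $\theta$ no less than
$z$. On the other hand, to show $\mathcal{M}(z, \theta_1) \le \mathcal{M}(z, \theta_2)$ for $\theta_1 < \theta_2 \le z$, note that

$$\mathcal{M}(z, \theta_1) = \left[ \frac{w(z, \theta_1)}{w(z, z)} \right]^n \le \left[ \frac{w(z, \theta_2)}{w(z, z)} \right]^n = \mathcal{M}(z, \theta_2),$$

where the inequality is due to the fact that $w(z, \theta)$ is non-decreasing with respect to $\theta$ no greater
than $z$. This justifies statement (iii) of the theorem.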

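Next, consider the monotonicity of $\mathcal{M}(z, \theta)$ with respect to $z$ as described by statement (iv) of
the theorem. To show $\mathcal{M}(z_2, \theta) \le \mathcal{M}(z_1, \theta)$ for $z_2 > z_1 \ge \theta$, it is sufficient to note that

$$\mathcal{M}(z_2, \theta) = \left[ \frac{w(z_2, \theta)}{w(z_2, z_2)} \right]^n \le \left[ \frac{w(z_2, \theta)}{w(z_2, z_1)} \right]^n \le \left[ \frac{w(z_1, \theta)}{w(z_1, z_1)} \right]^n = \mathcal{M}(z_1, \theta),$$

where the first inequality is due to the fact that $w(z, \theta)$ is non-decreasing with respect to $\theta$ no
greater than $z$ and the second one is due to the assumption that the likelihood ratio $\Lambda(z, \theta_0, \theta_1) = \left[ \frac{w(z, \theta_1)}{w(z, \theta_0)} \right]^n$
is non-decreasing with respect to $z$. On the other side, to show $\mathcal{M}(z_2, \theta) \ge \mathcal{M}(z_1, \theta)$
for $z_1 < z_2 \le \theta$, it suffices to observe that

$$\mathcal{M}(z_1, \theta) = \left[ \frac{w(z_1, \theta)}{w(z_1, z_1)} \right]^n \le \left[ \frac{w(z_1, \theta)}{w(z_1, z_2)} \right]^n \le \left[ \frac{w(z_2, \theta)}{w(z_2, z_2)} \right]^n = \mathcal{M}(z_2, \theta),$$

where the first inequality is due to the fact that $w(z, \theta)$ is non-increasing with respect to $\theta$ no
less than $z$ and the second one is due to the assumption that the likelihood ratio $\Lambda(z, \theta_0, \theta_1) = \left[ \frac{w(z, \theta_1)}{w(z, \theta_0)} \right]^n$
is non-decreasing with respect to $z$. Statement (iv) of the theorem is thus proved.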
Finally, in order to show statement (v), notice that, in the course of proving that $\widehat{\theta}$ is an
unbiased estimator of $\theta$, we have shown that $E[T(X) - \theta] = 0$. Hence, applying the Berry-Esseen
inequality and Lyapunov's inequality, we have that both (11) and (12) are true.






D Proof of Theorem 7



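For simplicity of notations, define $F_\varphi(z, \theta) = \Pr\{\varphi \le z \mid \theta\}$ and $G_\varphi(z, \theta) = \Pr\{\varphi \ge z \mid \theta\}$. By
the assumption of the theorem, we have

$$F_\varphi(z, \theta) \le \mathcal{F}(z, \theta) \quad \text{for } z \le \theta, \qquad (24)$$

$$G_\varphi(z, \theta) \le \mathcal{G}(z, \theta) \quad \text{for } z \ge \theta. \qquad (25)$$

Making use of (24), the assumption that $\mathcal{F}(z, \theta)$ is non-increasing with respect to $\theta \ge z$, and the
assumption that $\{\mathcal{F}(\varphi, U(\varphi, \delta)) \le \frac{\delta}{2},\ \varphi \le U(\varphi, \delta)\}$ is a sure event, we have

$$\begin{aligned}
\{U(\varphi, \delta) \le \theta\}
&= \left\{ \varphi \le U(\varphi, \delta) \le \theta,\ \mathcal{F}(\varphi, U(\varphi, \delta)) \le \tfrac{\delta}{2} \right\} \\
&\subseteq \left\{ \varphi \le U(\varphi, \delta) \le \theta,\ \mathcal{F}(\varphi, \theta) \le \tfrac{\delta}{2} \right\} \\
&\subseteq \left\{ \varphi \le U(\varphi, \delta) \le \theta,\ F_\varphi(\varphi, \theta) \le \tfrac{\delta}{2} \right\} \subseteq \left\{ F_\varphi(\varphi, \theta) \le \tfrac{\delta}{2} \right\},
\end{aligned}$$

which implies that $\Pr\{U(\varphi, \delta) \le \theta\} \le \Pr\{F_\varphi(\varphi, \theta) \le \frac{\delta}{2}\} \le \frac{\delta}{2}$. On the other hand, making use
of (25), the assumption that $\mathcal{G}(z, \theta)$ is non-decreasing with respect to $\theta \le z$, and the assumption
that $\{\mathcal{G}(\varphi, L(\varphi, \delta)) \le \frac{\delta}{2},\ \varphi \ge L(\varphi, \delta)\}$ is a sure event, we have

$$\begin{aligned}
\{L(\varphi, \delta) \ge \theta\}
&= \left\{ \varphi \ge L(\varphi, \delta) \ge \theta,\ \mathcal{G}(\varphi, L(\varphi, \delta)) \le \tfrac{\delta}{2} \right\} \\
&\subseteq \left\{ \varphi \ge L(\varphi, \delta) \ge \theta,\ \mathcal{G}(\varphi, \theta) \le \tfrac{\delta}{2} \right\} \\
&\subseteq \left\{ \varphi \ge L(\varphi, \delta) \ge \theta,\ G_\varphi(\varphi, \theta) \le \tfrac{\delta}{2} \right\} \subseteq \left\{ G_\varphi(\varphi, \theta) \le \tfrac{\delta}{2} \right\},
\end{aligned}$$

which implies that $\Pr\{L(\varphi, \delta) \ge \theta\} \le \Pr\{G_\varphi(\varphi, \theta) \le \frac{\delta}{2}\} \le \frac{\delta}{2}$. Finally, by virtue of the established
fact that $\Pr\{U(\varphi, \delta) \le \theta\} \le \frac{\delta}{2}$ and $\Pr\{L(\varphi, \delta) \ge \theta\} \le \frac{\delta}{2}$, we have $\Pr\{L(\varphi, \delta) < \theta < U(\varphi, \delta) \mid
\theta\} \ge 1 - \Pr\{U(\varphi, \delta) \le \theta\} - \Pr\{L(\varphi, \delta) \ge \theta\} \ge 1 - \frac{\delta}{2} - \frac{\delta}{2} = 1 - \delta$. This completes the proof of
the theorem.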
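
The coverage guarantee can be checked exactly in a small discrete example. The following sketch is an illustration under stated assumptions rather than part of the proof: it assumes the sample mean of n Bernoulli trials, uses the bound [(theta/z)^z ((1-theta)/(1-z))^(1-z)]^n for both tails, finds the limits by bisection, and sums the binomial probabilities of all outcomes whose interval contains theta. The names `log_bound`, `limit` and `exact_coverage` are hypothetical, and theta is restricted to the open interval (0, 1) because the limits are clamped at z = 0 and z = 1.

```python
import math


def log_bound(z, theta, n):
    """Log of [(theta/z)^z * ((1-theta)/(1-z))^(1-z)]^n (assumed Bernoulli case)."""
    def term(a, b):
        # a * log(b / a), with the conventions 0 * log(.) = 0 and log(0) = -inf
        if a == 0.0:
            return 0.0
        if b == 0.0:
            return -math.inf
        return a * math.log(b / a)
    return n * (term(z, theta) + term(1.0 - z, 1.0 - theta))


def limit(z, n, delta, upper, tol=1e-12):
    """Bisection for the upper limit (upper=True) or the lower limit (upper=False)."""
    target = math.log(delta / 2.0)
    if upper:
        if z >= 1.0:
            return 1.0  # clamped boundary case (convention of this sketch)
        ok, bad = 1.0, z  # bound <= delta/2 at ok, bound = 1 at bad
    else:
        if z <= 0.0:
            return 0.0  # clamped boundary case (convention of this sketch)
        ok, bad = 0.0, z
    while abs(ok - bad) > tol:
        mid = 0.5 * (ok + bad)
        if log_bound(z, mid, n) <= target:
            ok = mid
        else:
            bad = mid
    return ok


def exact_coverage(theta, n, delta):
    """Probability, under theta, that the interval strictly contains theta."""
    total = 0.0
    for k in range(n + 1):
        z = k / n
        lower = limit(z, n, delta, upper=False)
        upper = limit(z, n, delta, upper=True)
        if lower < theta < upper:
            total += math.comb(n, k) * theta ** k * (1.0 - theta) ** (n - k)
    return total


if __name__ == "__main__":
    for theta in (0.1, 0.3, 0.5, 0.9):
        print(theta, exact_coverage(theta, 30, 0.1))
```

Because the bound is conservative, the computed coverage is expected to exceed the nominal level, in line with the inequality established above.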

E Proof of Theorem 8

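Let $N_b(\delta)$ be the minimum sample size to ensure that

$$\Pr\{\overline{X}_n \ge \mu + \varepsilon\} < \frac{\delta}{2}, \qquad \Pr\{\overline{X}_n \le \mu - \varepsilon\} < \frac{\delta}{2}.$$

Since $\Pr\{|\overline{X}_n - \mu| \ge \varepsilon\}$ equals the summation of $\Pr\{\overline{X}_n \ge \mu + \varepsilon\}$ and $\Pr\{\overline{X}_n \le \mu - \varepsilon\}$, we have
that $\Pr\{|\overline{X}_n - \mu| \ge \varepsilon\} < \delta$ implies $\Pr\{\overline{X}_n \ge \mu + \varepsilon\} < \delta$ and $\Pr\{\overline{X}_n \le \mu - \varepsilon\} < \delta$. Consequently,

$$N_a(\delta) \ge N_b(2\delta).$$

Since $\Pr\{\overline{X}_n \ge \mu + \varepsilon\} < \frac{\delta}{2}$ and $\Pr\{\overline{X}_n \le \mu - \varepsilon\} < \frac{\delta}{2}$ together imply $\Pr\{|\overline{X}_n - \mu| \ge \varepsilon\} < \delta$, we
have

$$N_a(\delta) \le N_b(\delta).$$

Therefore, $N_b(2\delta) \le N_a(\delta) \le N_b(\delta)$. We claim that $\lim_{\delta \to 0} \frac{N_c(\delta)}{N_b(\delta)} = 1$. To show this claim, we
define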



and 



Then. 



and 



lnPi{X n >n + e} 



lnPr{X„ <u-e} 



Q + <0, Q~<0, lnJ^M - e,fi) < 0, ln£(/x + e,/z) < 



It follows that 



lim ,., = hm ... x max 
<5^o AT b (5) AT b (<5) 

max{Q+, Q~} 



In- 



lnf 



In — e, (i) ' In + e, (J,) 



= lim 



5->-o max{ln.F(/i — e, /i), lnC* (/j + e, a)} 
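By Chernoff's theorem,

$$\lim_{\delta \to 0} Q^+ = \ln \mathcal{G}(\mu + \varepsilon, \mu), \qquad \lim_{\delta \to 0} Q^- = \ln \mathcal{F}(\mu - \varepsilon, \mu),$$

and consequently,

$$\lim_{\delta \to 0} \max\{Q^+, Q^-\} = \max\{\ln \mathcal{F}(\mu - \varepsilon, \mu),\ \ln \mathcal{G}(\mu + \varepsilon, \mu)\}$$

and the claim follows. Using the established claim, we have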

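$$\begin{aligned}
\lim_{\delta \to 0} \frac{N_b(\delta)}{N_b(2\delta)}
&= \lim_{\delta \to 0} \frac{N_b(\delta)}{N_c(\delta)} \times \frac{N_c(\delta)}{N_c(2\delta)} \times \frac{N_c(2\delta)}{N_b(2\delta)} \\
&= \lim_{\delta \to 0} \frac{\max\left\{ \frac{\ln\frac{\delta}{2}}{\ln \mathcal{F}(\mu - \varepsilon, \mu)},\ \frac{\ln\frac{\delta}{2}}{\ln \mathcal{G}(\mu + \varepsilon, \mu)} \right\}}{\max\left\{ \frac{\ln \delta}{\ln \mathcal{F}(\mu - \varepsilon, \mu)},\ \frac{\ln \delta}{\ln \mathcal{G}(\mu + \varepsilon, \mu)} \right\}}
= \lim_{\delta \to 0} \frac{\ln\frac{\delta}{2}}{\ln \delta} = 1.
\end{aligned}$$

Recalling $N_b(2\delta) \le N_a(\delta) \le N_b(\delta)$, we can conclude that $\lim_{\delta \to 0} \frac{N_a(\delta)}{N_b(\delta)} = 1$. Finally, recalling the
established claim that $\lim_{\delta \to 0} \frac{N_c(\delta)}{N_b(\delta)} = 1$, the proof of the theorem is thus completed.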